ABSTRACT

This  thesis  presents  a  design  for  a  system 
initialization  mechanism  for  a  multiple  processor  system. 
The  design  is  based  upon  a  system  of  microprocessors 
(specifically the Intel 8086) being used with a set of
application  processes,  as  is  common  in  many  real-time 
processing  applications.  The  design  is  based  upon  the 
concepts  of  explicit  communicating  processes  and  explicit 
memory segmentation, although it does not require full
hardware  segmentation. 

With the goal of simplifying the system initialization
function,  this  thesis  segregates  the  required  initialization 
actions  into  three  distinct  phases.  The  specific  phase  for 
each  action  is  determined  by  which  phase  provides  the  most 
supportive  environment  for  that  particular  action. 

While  the  initialization  mechanism  described  in  this 
thesis  was  developed  for  a  particular  real-time  application, 
the  design  concepts  described  are  applicable  to  a  variety  of 
hardware and operating system configurations.

I. INTRODUCTION

A.  OBJECTIVES 

System  initialization  is  the  method  used  to  get  an 
operating  system  loaded  and  running  on  a  computer  system. 
This is a recurring requirement that must be accomplished
each  time  the  computer  is  powered  up  and  each  time  the  user 
wishes  to  change  from  one  operating  system  to  another.  This 
thesis  presents  a  versatile,  simple  to  understand,  and 
widely  applicable  system  initialization  mechanism  based  on  a 
careful  sequencing  of  the  initialization  activities.  These 
activities  will  be  performed  in  one  of  the  three  system 
initialization  phases  addressed  in  this  thesis  based  upon 
which  phase  provides  the  most  supportive  environment  for 
each  particular  activity. 

Traditionally,  operating  system  designers  have  ignored 
the  system  initialization  problem  until  the  final 
development  stages.  As  a  result,  most  existing  system 
initialization  schemes  are  rather  ad-hoc,  using  a  mass  of 
"special  case"  activities  to  accomplish  initialization.  This 
thesis  addresses  these  problems  by  providing  a  framework  for 
a  simple  system  initialization  process  that  can  be  used  with 
a  variety  of  hardware  and  operating  system  configurations. 
The approach in this thesis is to make the system
initialization mechanism appear as much like a normal
applications program as possible, and thus use the operating
system  services  to  the  fullest  extent.  This  approach  is  made 
possible  by  two  operating  system  concepts  that  are  being 
used  in  many  current  operating  systems  on  large  mainframe 
and  minicomputers,  but  have  only  recently  been  introduced  in 
the  microprocessor  arena.  The  first  is  the  concept  of 
segmented  memory.  The  second  is  the  concept  of  asynchronous 
processes,  including  an  "idle  process"  so  that  the  system 
always  "comes  to  rest"  in  a  state  that  is  easily  created  and 
controlled.  These  two  concepts  permit  the  initialization 
mechanism  to  avoid  the  special  cases  and  ad-hoc  methods  used 
in  so  many  existing  mechanisms. 

B.  MOTIVATION 

For  several  years,  the  Solid  State  Laboratory  at  the 
Naval  Postgraduate  School  has  been  conducting  research  in 
the image processing area. A relatively recent area of
research has been in the development of "smart sensors" for
missile guidance, radar, surveillance, and other image
processing applications [1]. Current sensor platforms relay
massive  amounts  of  raw  data  to  ground-based  processing 
centers.  The  smart  sensor  will  provide  on-board  processing 
of  collected  data  such  that  only  the  initial  processed  image 
and  periodic  updates  need  be  downlinked  to  the  surface. 
Clearly,  a  smart  sensor  will  require  on-board  electronics  to 
do the data processing.

Several Naval Postgraduate School theses, under the
supervision of Professor T. P. Tao, have contributed to the
development of the smart sensor. In 19??, Yehoshua [2] and
Svenor [3] developed filter designs to improve infrared
background clutter suppression. In 1978, Hilmers [4] began
processing real-world infrared images. All the early
computer processing was done on an IBM-360 computer system.
In 1979, Celik [5] developed a simulation program on a
Digital Equipment Corporation (DEC) LSI-11 microcomputer in
an attempt to marry current hardware and software research
efforts. Due to its limited primary memory and slow
processing speed, however, the LSI-11 proved inadequate for
anything but simulation and experimentation. This spawned
additional  research  in  the  area  of  microprocessors  and 
microcomputer  architecture.  In  late  1979,  Brenner  [6] 
presented  a  multiple  microprocessor  system  design,  using 
commercially  available,  off-the-shelf  components,  that  could 
process  the  algorithms  developed  in  earlier  research  and 
also  provide  real-time,  or  near  real-time,  system  response. 

Before that goal could be reached, however, an operating
system  was  required  to  control  the  operation  of  the  computer 
system.  This  operating  system  would  provide  an  interface 
between  the  computer  hardware  and  the  user.  The  operating 
system  concepts  used  were  based  on  the  Multics  operating 
system [13,17]. The basic microcomputer operating system
design was developed by O'Connell and Richardson [18]. V. J.
Wasson [7] refined and implemented the basic core, or
kernel, of the operating system. The system initialization
design presented in this thesis was developed concurrently
with  the  kernel  of  the  operating  system. 

C.  TERMS  EXPLAINED 

In  order  to  facilitate  the  discussion  of  system 
initialization, a few terms should be clearly understood.

1.  Operating  System 

The operating system is that set of program modules
within a computer system that govern the utilization of
computer resources [8]. These resources can be grouped into
four major categories: processors, memory, external
Input/Output (I/O) devices, and the secondary storage that
contains the programs and data.

2.  Process 

This thesis uses the word "process" to mean the
internal representation of a computational task. Each
process can be uniquely characterized by its execution point
(viz., the state of its processor registers), and its
address space (viz., the memory accessible to that process).
Since only one process can be running on a physical
processor at a time, the operating system will multiplex a
number of processes onto each processor. While one process
is running, the other processes will be waiting their turns
to be scheduled and run. But, when viewed in the long term,
each process can be seen as proceeding through its execution
[9]. This is consistent with Saltzer's definition of a
process as a program in execution on a pseudo-processor
[10].

3.  Hardware  Configuration 

The  hardware  configuration  is  defined  as  that  set  of 
hardware  components,  or  modules,  present  in  the  system.  For 
example,  processors  and  memory  modules  are  parts  of  the 
hardware  configuration. 

4. Software Configuration

The  software  configuration  is  made  up  of  the 
processes,  system  tables,  and  system  parameters.  For 
example,  the  number  of  processes  allowed  in  the  system  at  a 
time  would  be  considered  a  part  of  the  software 
configuration.

5.  System  Configuration 

The  system  configuration  will  be  the  combination  of 
the  hardware  configuration  and  the  software  configuration. 

6.  Application 

An  application  is  defined  as  a  program  that  causes 
the  computer  system  to  perform  some  useful  work. 

7.  Virtual  Environment 

A  key  concept  in  this  thesis  is  that  of  the  virtual 
machine  environment.  Briefly,  virtualization  results  in  a 
hierarchy  of  levels  of  abstraction,  each  building  upon  the 
facilities provided by the previous level. If the computer
hardware is considered as the lowest level, then the traffic
controller, or processor scheduler, could be the next higher
level and the applications programs could be the highest
level.  Thus  each  level  of  abstraction  runs  on  the  virtual 
machine  provided  by  the  lower  levels  of  abstraction,  and 
each  level  becomes  a  part  of  the  virtual  machine  seen  by 
higher  levels. 

8. Core Image

A core image is an exact
representation of a sequence of instructions and their
associated data structures as they would appear in
primary memory just prior to execution, but residing on some
secondary storage medium. This term is somewhat of an
anachronism, since core memory has been replaced by
semiconductor memory in most modern computer systems, but it
is descriptive of the concept, and will be used extensively
throughout this thesis.

9.  System  Initialization  Phases 


In one of the few publications dealing with system
initialization, Luniewski [11] views the system
initialization functions with respect to three phases, or
time periods. This thesis follows that same approach.

a. System Generation Time

The  bootload  medium  (viz.,  a  core  image  of  the 
operating  system)  is  created  at  system  generation  time.  This 
normally  occurs  during  a  previous  session  of  system 
operation, or is done on a separate development computer
system. 

b.  Bootload  Time 

Bootload  time  is  when  the  lowest  level  of  the 
operating  system  is  actually  loaded  into  the  primary  memory 
and  its  system  parameters  and  tables  initialized. 

c. Run Time

The period following bootload time, when the
operating  system  programs  are  running  normally,  is  called 
run  time. 

10. Multiprogramming

This  term  describes  a  system  in  which  two  or  more 
processes  can  be  in  one  of  several  "states  of  execution"  at 
one  time.  A  process  is  in  a  state  of  execution  if  it  has 
been  started  but  has  not  yet  been  completed  or  terminated  by 
an  error  condition  [8].  In  this  thesis,  a  process  is  said  to 
be  "running"  if  it  is  assigned  a  physical  processor  and  its 
instructions are being executed. A process is "ready" if it
could run, but is not currently assigned a physical
processor. A process is "blocked" if it is waiting for some
event to occur (e.g., an I/O operation to complete or the
completion  of  some  action  by  another  process). 
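
The three states named above can be summarized in a short sketch. It is illustrative only: the names ProcessState and can_be_scheduled are assumptions made for this example and are not taken from the operating system described in this thesis.

```python
from enum import Enum


class ProcessState(Enum):
    """Execution states named in the text; the enum itself is hypothetical."""
    RUNNING = "running"   # assigned a physical processor, instructions executing
    READY = "ready"       # could run, but no physical processor is assigned
    BLOCKED = "blocked"   # waiting for an event such as an I/O completion


def can_be_scheduled(state: ProcessState) -> bool:
    """Only a ready process is a candidate for the next free processor."""
    return state is ProcessState.READY
```

A blocked process becomes a scheduling candidate only after the event it awaits has occurred and its state has returned to ready.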

11. Multiprocessing

This  term  implies  that  more  than  one  processing 
unit  is  present  in  the  hardware  configuration. 
Multiprocessing is used to achieve greater processing power,
reliability,  and  economies  of  scale. 

12.  The  Bootload  Program 

A  bootload  program  is  a  simple  program  written  to 
run  on  bare  hardware.  The  bootload  program  is  typically 
stored in read-only memory (ROM), although it may be
extended by a "bootstrap" program read in from a fixed
location in secondary storage. It is used to read the core
image of the base layer of the operating system from
secondary storage, load it into the computer's primary
memory, and get the operating system running.

13.  The  Loader  Process 

The  loader  process  is  one  of  the  modules  that  are 
loaded  in  with  the  base  layer  of  the  operating  system.  It  is 
similar  in  function  to  the  bootload  program,  but  it  is  used 
to  load  the  higher  layers  of  the  operating  system  and  the 
application  programs.  The  primary  difference  is  that  the 
loader  process  is  used  at  run  time,  and  makes  use  of  the 
operating  system  functions  and  services  provided  by  the  base 
layer.

D.  GENERAL  DISCUSSION 

In  general,  the  objective  of  system  initialization  is  to 
get  the  operating  system  loaded  into  primary  memory  and 
running  so  that  it  can  provide  the  support  facilities 
necessary  to  run  applications  programs.  This  procedure  is 
carried out in three basic steps that correspond to the
three system initialization phases above. First of all, the
bootload program and the core image of the operating system
are developed. This phase occurs prior to, and somewhat
independently of, the next two steps.

The  bootload  program  is  executed  in  phase  two  of  system 
initialization.  Its  purpose  is  to  read  the  base  layer  of  the 
operating system from some secondary storage medium (e.g.,
magnetic  tape  or  disc)  and  to  load  the  data  that  it  reads 
into  primary  memory.  The  primary  memory  addresses  are  either 
determined  by  the  loader  or  are  encoded  in  the  data.  The 
secondary  storage  medium  will  contain  the  operating  system 
code  and  data  structures.  This  second  phase  also  involves 
some  preprocessing  of  the  core  image  data  in  order  that  the 
loader  may  initialize  the  processor  registers  and  some 
operating system data structures in preparation for running
the  operating  system  programs.  For  example,  the  core  image, 
as  it  exists  on  secondary  storage,  contains  load  addresses 
and  some  key  processor  register  values.  The  bootload  program 
must  strip  off  this  information  and  use  it  to  initialize  the 
registers  and  data  structures  as  mentioned  above.  The 
details  of  the  bootload  program  will  be  discussed  further  in 
Chapter  III. 
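
The bootload step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the bootload program of this thesis: the core image layout used here (a header of three 16-bit register values followed by records that each carry a load address, a byte count, and the data) is hypothetical, as are the names HEADER, RECORD, and bootload.

```python
import struct

HEADER = struct.Struct("<HHH")  # hypothetical header: initial CS, IP, SS values
RECORD = struct.Struct("<IH")   # hypothetical record prefix: load address, byte count


def bootload(image: bytes, memory: bytearray) -> dict:
    """Copy each record to its load address and return the register values."""
    cs, ip, ss = HEADER.unpack_from(image, 0)
    pos = HEADER.size
    while pos < len(image):
        address, count = RECORD.unpack_from(image, pos)
        pos += RECORD.size
        if pos + count > len(image) or address + count > len(memory):
            raise ValueError("record does not fit the image or the memory")
        # Strip the load address from the record and store only the data.
        memory[address:address + count] = image[pos:pos + count]
        pos += count
    return {"CS": cs, "IP": ip, "SS": ss}
```

The returned values stand in for the register initialization that precedes the transfer of control to the operating system code.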

The  last  phase  of  initialization  occurs  when  the 
bootload program passes control to the first executable
statement in the operating system code. At this point, the
operating  system  will  begin  its  normal  execution. 

It  is  a  basic  premise  of  this  thesis  that  actions 
performed  during  system  generation  time  or  run  time  are 
inherently  simpler  than  the  same  action  performed  during  the 
bootload  phase.  Therefore,  this  thesis  takes  the  position 
that  the  entire  system  initialization  process  can  be  greatly 
simplified  if  the  core  image  produced  in  system  generation 
is  as  complete  as  possible,  thereby  reducing  the  amount  of 
processing required at bootload time. The justification for
this  line  of  reasoning  should  become  clear  in  the  following 
chapter. 

With  the  layered  approach  to  system  generation  provided 
by  the  virtual  environment  concept,  the  most  difficult  task 
faced  in  system  initialization  is  the  bootloading  of  the 
base  level  of  the  operating  system.  Once  this  has  been 
accomplished,  the  initialization  process  can  take  advantage 
of  the  services  provided  by  this  base  layer  to  carry  out  the 
remainder  of  its  activities.  As  subsequent  layers  are 
initialized,  more  and  more  services  become  available  and  the 
virtual  machine  seen  by  the  system  initialization  process 
becomes  increasingly  powerful. 

E.  HIGH  LEVEL  LANGUAGE  PROGRAMMING 

Since simplicity and general applicability are two goals
of this thesis, the design described herein is oriented
almost totally towards a high level programming language
(PL/M). The motivation for this decision came from several
sources. Nelson [12] reported a three-to-one increase in
productivity when a high level language was used instead of
assembly language. While the standard deviations he reported
were large, the evidence was overwhelmingly in favor of high
level languages. Corbato, Saltzer, and Clingen [12]
attribute much of the success of the Multics development to
the use of a high level programming language (PL/I) and the
interactive debugging that Multics provided. Brooks [14]
agrees  that  the  increases  in  productivity  and  debugging 
speed  are  overwhelming  reasons  to  use  a  high  level  language 
in  the  design  and  implementation  of  systems  programs.  A  high 
level  language  will  also  serve  as  a  communication  tool  for 
anyone  who  reads  the  program  listing.  The  logical  structure 
of  the  program  can  be  reflected  in  the  listing,  and  comments 
may  be  inserted  at  will  to  clarify  potentially  confusing 
portions  of  the  program. 

F.  STRUCTURE  OF  THE  THESIS 

With  this  chapter  as  an  introduction,  Chapter  II  will 
present  an  overview  of  the  environment  in  which  this  design 
was developed and implemented. This overview will include the
hardware used in the project and a brief look at the
philosophy used in the development of the operating system.
Chapter III presents the detailed design and proposed
implementation. Chapter IV presents the conclusions reached
during the design of this system initialization mechanism,
and some recommendations for future research that might use
this design as a base.

G.  SUMMARY 

This  chapter  has  provided  the  reader  with  the  objectives 
that  this  thesis  hopes  to  accomplish,  and  with  the 
motivation  behind  the  thesis  project.  It  has  introduced  the 
reader  to  system  initialization  by  defining  some  of  the 
terms  used  in  the  thesis,  and  by  presenting  a  brief  general 
discussion  of  the  initialization  function.  This  chapter  has 
also explained the motivation behind the almost-exclusive
use  of  high-level  language  programming  in  the  development  of 
the  programs  for  this  thesis. 


II. THE DEVELOPMENT ENVIRONMENT


A.  OBJECTIVE 


This chapter will provide a detailed description of the
environment in which the system initialization mechanism was
developed.  It  will  include  an  explanation  of  the  hardware 
used  to  develop  the  design  for  the  mechanism,  some  basic 
concepts  from  the  operating  system  it  is  designed  to 
initialize,  and  some  of  the  assumptions  made  about  the 
multiple  microcomputer  system  and  the  smart  sensor 
algorithms  that  the  system  is  designed  to  run. 


B.  HARDWARE 

As discussed in the background section of Chapter I,
when it was determined that the single LSI-11 microcomputer
would handle the processing requirements for a smart sensor
system, but would not achieve the desired speeds, the search
for a replacement processor suitable for use in a
multiple-processor computer system began. The decision was
made to focus the search on currently available commercial
[...] selection criteria considered. The search initially
identified the DEC LSI-11/23, the Intel 8086, the Motorola
68000, and the Zilog Z8000 as candidates.

The decision to use the Intel 8086 was finally made,
based  upon  its  performance  specifications,  past  experience 
with  other  Intel  products,  and  the  fact  that  it  was 
commercially  packaged  for  multiprocessor  applications.  The 
fact  that  it  was  available  off-the-shelf  and  supported  with 
a  full  product  line  of  support  software  and  peripheral 
equipment  also  had  an  impact  on  the  selection. 

The Intel 8086 is a 16-bit, HMOS technology
microprocessor. It has a clock rate of 5 Megahertz (MHz). By
combining  a  base  address  with  an  offset,  it  can  directly 
access  a  full  Megabyte  of  primary  memory.  It  is  capable  of 
both  8-bit  and  16-bit  signed  or  unsigned  arithmetic  in 
binary  or  decimal  bases,  including  multiply  and  divide  [15]. 
It  achieves  its  relatively  high  speed  through  a  combination 
of  its  HMOS  technology  and  some  architectural  advancements. 
A  major  factor  in  its  architecture  is  the  overlapping  of 
instruction  fetch  and  instruction  execution.  An  instruction 
stream  byte  queue  provides  for  pre-fetching  up  to  six  bytes 
of  instruction  during  the  execution  of  previously  fetched 
instructions.  The  exact  number  of  instructions  prefetched  is 
a  function  of  the  instructions  being  fetched,  since  they 
vary in length.

The one megabyte memory accessible to the 8086 is viewed
as a group of segments that are defined by the application.
A  segment  can  be  described  as  a  logical  unit  of  memory  that 
may  be  up  to  64  kilobytes  long  [15].  Note  that  the  segment 
length  boundary  is  not  enforced  by  the  hardware.  Effective 
address  calculations  are  done  with  modulo  64k  addition,  so 
attempts to access past this boundary result in
"wrap-around" to the beginning of the segment. Each segment
is a set of contiguous locations and is an independent,
separately addressable unit. As seen in figure II-1, at the
hardware  level  segments  may  be  totally  disjoint,  adjacent, 
partially  overlapped,  or  fully  overlapped.  However,  the 
integrity  of  this  operating  system  design  demands  that  two 
segments  of  a  process  can  never  overlap.  To  access  a 
particular  memory  location,  it  is  necessary  to  provide  the 
base  address  (viz.,  in  a  processor  base  register)  of  the 
segment  that  contains  that  location,  and  the  offset  from  the 
base  address  to  that  location.  The  base  address  must  be  an 
exact multiple of 16. To obtain the effective address, given
the base and offset, the 8086 performs a left shift of four
places on the 16-bit base register value, zero-filling from
the low-order end. This shifted base register value is then
added to the address offset. This results in a 20-bit
effective address, and hence the one megabyte address space.
Figure II-2 represents the address-formation process.
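
The address-formation process and the 64K wrap-around described above can be sketched as follows. The function and constant names are assumptions made for this example; the final mask reflects the 20-bit width of the effective address.

```python
SEGMENT_SIZE = 0x10000   # offsets are computed modulo 64K
ADDRESS_MASK = 0xFFFFF   # 20-bit effective address, hence one megabyte


def effective_address(segment_register: int, offset: int) -> int:
    """Shift the 16-bit base register left four places and add the offset."""
    base = (segment_register & 0xFFFF) << 4   # zero-filled low-order bits
    wrapped_offset = offset % SEGMENT_SIZE    # wrap-around within the segment
    return (base + wrapped_offset) & ADDRESS_MASK
```

For instance, a base register value of 1234H with an offset of 0010H gives the effective address 12350H, and an offset of 10000H wraps around to the first location of the segment.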

Hardware Segmentation in the 8086
Figure II-1

The processor has direct access to four segments at any
one time [15]. Their base addresses, or starting locations,
are contained in four segment registers. The Code Segment
(CS) register points to the base of the code segment, from
which instructions are fetched. The value contained in the
Instruction  Pointer  (IP)  register  gives  the  offset,  from  the 
CS  value,  to  the  next  instruction  to  be  executed.  The  Stack 
Segment  (SS)  register  is  a  pointer  to  the  base  of  the  stack 
segment.  Stack  operations  are  performed  on  the  locations  in 
this  segment.  The  Data  Segment  (DS)  register  points  to  the 
current data segment, which is used to maintain program
variables.  There  is  also  available  an  Extra  Segment  (ES) 
register,  that  may  point  to  an  additional  segment  used  for 
data  storage. 

Another major factor in the selection of the Intel 8086
was the availability of the Intel iSBC 86/12A single board
computer. The 86/12A is a complete microcomputer system on
one 6.75 by 12.2 inch printed circuit board. The version of
the 86/12A used in this design contains a 5MHz 8086
processor, 32K bytes of random-access memory (RAM), 8K bytes
of electrically programmable read-only memory (EPROM),
programmable  serial  and  parallel  I/O  interfaces,  a 
programmable  interrupt  controller,  a  real-time  clock,  and  an 
interface  to  the  Intel  Multibus  for  interconnection  to  other 
devices  [15].  At  the  hardware  level,  the  32K  bytes  of  RAM  is 
dual-ported.  That  is,  the  RAM  on  one  board  in  a 
multi-computer system is available to all the other
processors in that system. The on-board RAM of each 86/12A
is actually seen as two address spaces in a multi-computer
configuration. However, the operating system design does not
support, nor can it tolerate, a segment having two
addresses. The dual port feature is used during system
initialization, but this is a temporary measure, being used
until a suitable bootload program is available in the EPROM.
The  processor  on  the  same  board  sees  its  local  memory  as  the 
address space between 00000H and 07FFFH (32K bytes). The other boards in
the system see that same RAM as a different address space;
the  exact  address  range  depends  on  the  board  on  which  it 
resides  and  the  strapping  options  employed  in  the  hardware. 
Figure  II-3  shows  a  system  diagram  of  the  iSBC  86/12A  single 
board  computer. 
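
The two views of the dual-ported RAM can be illustrated with a short sketch. It assumes that the on-board RAM appears to other boards as a contiguous block starting at some base address; that base, written board_base below, is a hypothetical stand-in for whatever address the strapping options select, and the function name is likewise an assumption.

```python
LOCAL_RAM_SIZE = 32 * 1024   # 32K bytes of on-board RAM per board


def multibus_address(local_address: int, board_base: int) -> int:
    """Translate an on-board RAM address into the address other boards use.

    board_base is a hypothetical stand-in for the strapping-selected base.
    """
    if not 0 <= local_address < LOCAL_RAM_SIZE:
        raise ValueError("address is outside the on-board RAM")
    return board_base + local_address
```

Because the same location is reachable through two addresses, the operating system design must ensure that a segment is always referenced through only one of them.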


The hardware configuration of the multiple
microprocessor  system  used  in  this  thesis  project  is  shown 
in  figure  II-4.  It  is  housed  in  an  Intel  ICS-80  chassis, 
which  provides  the  power  supplies,  cooling  fans,  and  the 
Multibus  connections.  System  components  include  a  Mu-Pro 
128K  byte  error  detecting/error  correcting  RAM  board  and  up 
to six iSBC 86/12A's. Near-term hardware enhancements
include  a  Multibus  interface  to  a  hard  disc  system  for 
on-line  secondary  storage,  and  an  image  display  device  for 
smart sensor software development.

Program development was done on an Intel INTELLEC-II
microcomputer development system (MDS). Since no secondary
storage was available on the multiple microcomputer system,
the MDS system was used to simulate secondary storage for
the  86/12A's.  A  program  written  for  the  MDS  provides 
communication  to  one  of  the  multiple  microcomputers  via  a 
serial-port-to-serial-port  connection.  The  bootload  program 
and  the  operating  system  loader  view  the  port  just  as  if  it 
were  the  interface  to  a  secondary  storage  device. 

As  shown  in  figure  II-4,  the  two  computer  systems  are 
also connected by an Intel ICE-86 in-circuit emulator [16].
The ICE-86 is used to aid in program development. In this
application, it is also used to load into the 86/12A's those
programs that will eventually reside in EPROM. Since the
86/12A's do not have direct access to secondary storage via
the system bus, the run-time loader process that runs on the
processor connected to the MDS via the serial port link must
perform the disc I/O function and make the disc data
available to the other loader processes. When the hard disc
is  installed,  all  the  run-time  loader  processes  will  be 
identical.  Until  that  time,  the  method  described  above  and 
detailed  in  the  next  chapter  will  be  used  for  system 
initialization. 


C.  OPERATING  SYSTEM  BASICS 

The  operating  system  developed  for  the  microcomputer 
system  described  above  was  written  by  V.  J.  Wasson  [7]  in  a 
thesis  project  that  was  done  concurrently  with  this  thesis. 
It  uses  many  of  the  concepts  developed  for  the  Multics 
system [17], and is an extension, with a few changes, of the
distributed operating system concepts presented by O'Connell
and Richardson [18]. The operating system is intended to
provide  an  interface  between  the  user  and  the  hardware  such 
that  the  underlying  hardware  configuration  is  made 
invisible,  or  at  least  of  no  direct  concern,  to  the  user. 
This  section  of  the  thesis  is  intended  as  a  basic 
introduction to those operating system concepts and
mechanisms  that  directly  affect  system  initialization.  The 
reader  is  referred  to  the  thesis  by  Wasson  [7]  for 
additional  details. 

1. Processor Multiplexing

This operating system makes use of the virtual
environment concept introduced in Chapter I. This concept
provides a layered operating system consisting of several
levels. At the lowest level is the Inner Traffic Controller,
whose function is to multiplex Saltzer's "pseudo-processors"
[13] onto the physical processors present in the
system.  The  primary  data  base  used  by  the  Inner  Traffic 
Controller  is  the  Virtual  Processor  Map.  A  virtual  processor 
is  defined  as  a  "simulation"  of  a  processor  using  a  physical 
processor to interpret the instructions "executed" by the
simulated processor. The Virtual Processor Map contains each
virtual processor's execution state, its scheduling priority,
interprocess  communication  information,  a  descriptor  for  its 
address  space  (represented  by  the  location  of  its  stack 
segment),  and  a  scheduling  flag  that  signifies  that  the 
processor  has  been  sent  a  virtual  preempt  interrupt  by  some 
other  virtual  processor. 
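
One way to picture an entry of the Virtual Processor Map is sketched below. The field names, the state strings, and the selection policy (the ready entry with the numerically largest priority is chosen) are assumptions made for this example; the actual layout and scheduling rule are given in the thesis by Wasson [7].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VirtualProcessorEntry:
    """One Virtual Processor Map entry; field names are illustrative."""
    state: str                     # execution state, e.g. "running" or "ready"
    priority: int                  # scheduling priority (larger runs first here)
    awaited_event: Optional[str]   # stand-in for the communication information
    stack_segment: int             # locates the address space via its stack segment
    preempt_pending: bool = False  # set when a virtual preempt interrupt was sent


def select_next(vp_map: List[VirtualProcessorEntry]) -> Optional[int]:
    """Return the index of the highest-priority ready entry, or None."""
    ready = [i for i, vp in enumerate(vp_map) if vp.state == "ready"]
    if not ready:
        return None
    return max(ready, key=lambda i: vp_map[i].priority)
```

Returning None when no entry is ready corresponds to the case in which an idle process would be given the physical processor.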

At  the  next  level  is  the  Traffic  Controller.  The  Traffic 
Controller serves to multiplex processes onto these
pseudo-processors.  The  data  structure  used  by  the  Traffic 
Controller  is  called  the  Active  Process  Table.  This  table 
contains  the  information  needed  to  get  a  process  loaded  onto 
a  virtual  processor  and  running. 

Wasson  also  provides  a  "Gate"  module  at  the  next  level 
to  simplify  the  user's  interface  to  the  operating  system 
functions by providing a single entry point to the lower
levels  of  the  operating  system.  The  programmer  interfaces 
with  all  operating  system  functions  by  making  a  "call"  to 
the  gate  module  using  the  parameters  for  the  requested 
function  as  arguments  in  the  call. 
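
The single-entry-point idea behind the gate module can be sketched as a dispatch on the name of the requested function. The names make_gate and gate, and the use of a name-to-function table, are assumptions made for this example and do not describe the actual gate implementation.

```python
def make_gate(functions):
    """Build a single entry point that dispatches to the named function."""
    def gate(function_name, *args):
        if function_name not in functions:
            raise ValueError("unknown operating system function: " + function_name)
        # Pass the caller's arguments through to the requested function.
        return functions[function_name](*args)
    return gate
```

The caller sees only the gate; the table of lower-level functions behind it can change without affecting the calling programs.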

2.  The  Process  Parameter  Block 

In  addition  to  loading  the  processes  into  memory, 
system  initialization  must  also  identify  these  processes  to 
the  operating  system  so  that  they  can  be  scheduled  and  run. 
The  initialization  mechanism  described  in  this  thesis  uses  a 
Process Parameter Block to pass process definition
parameters to the process creation function of the operating
system. The Process Parameter Block is a per-processor
artifice into which each run-time loader process stores
definition parameters for the process being loaded. When the
operating system is ready to create [7] the process, it
extracts the parameters from the Process Parameter Block.
Since processes are loaded and created one at a time, the
memory locations in the parameter block can be reused for
each process. As seen in figure II-5, the Process Parameter
Block  contains  values  for  all  the  processor  registers 
associated  with  a  process.  Only  the  CS,  IP,  and  SS  register 
values  are  of  concern  in  this  thesis,  but  the  structure  was 
designed  to  provide  easy  expansion  during  later  research. 
The  Priority  is  used  by  the  scheduling  algorithm.  The 
Affinity  is  used  to  bind  a  process  to  a  particular 
processor . 
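
The hand-off described above can be sketched roughly as follows. This is only an illustration in Python, not the PL/M-86 code of the thesis; the names ProcessParameterBlock, loader_store, and os_create_process are hypothetical, and only the CS, IP, SS, Priority, and Affinity fields are modelled.

```python
from dataclasses import dataclass


@dataclass
class ProcessParameterBlock:
    """Hypothetical per-processor block, reused for every process loaded."""
    cs: int = 0        # initial Code Segment register value
    ip: int = 0        # initial Instruction Pointer value
    ss: int = 0        # initial Stack Segment register value
    priority: int = 0  # consumed by the scheduling algorithm
    affinity: int = 0  # physical processor the process is bound to


def loader_store(ppb, cs, ip, ss, priority, affinity):
    """Loader side: overwrite the block with the next process's parameters."""
    ppb.cs, ppb.ip, ppb.ss = cs, ip, ss
    ppb.priority, ppb.affinity = priority, affinity


def os_create_process(ppb, process_table):
    """Operating system side: copy the values out so the block can be reused."""
    process_table.append((ppb.cs, ppb.ip, ppb.ss, ppb.priority, ppb.affinity))


ppb = ProcessParameterBlock()
table = []
# Processes are loaded and created one at a time, so one block suffices.
for params in [(0x1000, 0x0000, 0x2000, 1, 0), (0x3000, 0x0010, 0x4000, 2, 0)]:
    loader_store(ppb, *params)
    os_create_process(ppb, table)
```

Because the operating system copies the values out before the next process is loaded, the single block never holds more than one process definition at a time.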

3.  Interprocess  Communication 

Of primary importance to any multiprogramming or
multiprocessing system is interprocess communication to
synchronize cooperating processes and control access to
shared resources. This operating system uses the
"Eventcounts and Sequencers" mechanism proposed by Kanodia
and Reed [19]. A summary of this mechanism is provided here,
since interprocess communication is vital to the run-time
loader processes.


[Figure: layout of the Process Parameter Block, showing the
FLAGS register, CS register, IP register, and further CPU
register entries]

Process Parameter Block
Figure II-5


An  eventcount  is  a  system  variable  that  represents  a 
class  of  events  that  will  occur  in  the  system.  A  virtual 
processor  can  perform  three  primitive  operations  on 
eventcounts.  It  may  obtain  the  current  value  of  an 
eventcount  by  performing  a  READ  of  that  eventcount.  It  can 
increment  by  one  the  current  value  of  an  eventcount  by  doing 
an  ITC_ADVANCE  on  that  eventcount.  Finally,  a  virtual 
processor  may  await  the  occurrence  of  a  particular  event 
within  the  class  of  events  associated  with  an  eventcount  by 
doing an ITC_AWAIT on that eventcount. This mechanism can be
simply viewed as using a counter to control the virtual
processors. However, it offers an advantage over the
traditional semaphore mechanism: the occurrence of an
event can be broadcast to several virtual processors that
might be awaiting it. This is more difficult to achieve with
more  traditional  interprocess  communication  schemes. 
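
A minimal sketch of the three eventcount primitives, and of the broadcast property, is given below. It is an assumption-laden illustration: Python threads stand in for virtual processors, the class Eventcount and its method names are hypothetical, and sequencers are omitted.

```python
import threading


class Eventcount:
    """Hypothetical eventcount: a counter that only increases, plus waiters."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def read(self):
        """Return the current value of the eventcount."""
        with self._cond:
            return self._value

    def advance(self):
        """Increment the value by one and wake every waiter (broadcast)."""
        with self._cond:
            self._value += 1
            self._cond.notify_all()

    def await_value(self, target):
        """Block until the eventcount has reached at least `target`."""
        with self._cond:
            self._cond.wait_for(lambda: self._value >= target)
            return self._value


ec = Eventcount()
woken = []
woken_lock = threading.Lock()


def waiter(name):
    ec.await_value(1)          # all three wait for the same event
    with woken_lock:
        woken.append(name)


threads = [threading.Thread(target=waiter, args=(n,)) for n in ("vp1", "vp2", "vp3")]
for t in threads:
    t.start()
ec.advance()                   # a single advance releases every waiter
for t in threads:
    t.join()
```

A single advance releases all three waiters, which is the broadcast behavior the text contrasts with semaphores.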

D.  DEVELOPMENT  TOOLS 

As  mentioned  earlier,  all  program  development  was  done 
on a separate development computer system. One major
advantage  of  using  such  a  system  is  the  supportive 
environment  it  provides  the  programmer.  This  support  is  in 
the  form  of  the  software  development  utilities  available 
from  the  manufacturer  of  the  development  system.  In  the 
development  of  the  system  initialization  programs  for  this 
thesis, the decision was made to take full advantage of
these utility programs. In addition to the PL/M-86 compiler,
three other utility programs, provided by Intel, are used
extensively during the system generation phase to create the
core image of the operating system to be loaded during the
bootload phase. These three Intel programs are called
LINK86, LOC86, and OH86 [20]. They are used to perform the
functions of linking, locating, and object file
transformation. Each of these functions is discussed below.
Appendix  A  contains  annotated  sample  outputs  from  the 
development  utility  programs  described  in  this  section. 

1.  Compiling Program Modules

The PL/M-86 compiler [21], in addition to
translating the high-level language statements into 8086
machine instructions, offers four mode options. These
options let the programmer determine the degree of
segmentation to be used. The SMALL option tells the compiler
to  produce  only  two  segments.  One  segment  combines  the  code 
sections  of  all  the  modules  in  the  program  (or  program 
section).  The  other  segment  contains  all  the  constant  and 
variable  data  and  the  stack.  This  mode  provides  the  greatest 
run-time  efficiency,  since  the  Code  Segment  register  and  the 
Data  Segment  register  (which  in  this  mode  is  identical  to 
the  Stack  Segment  register)  do  not  change  during  run-time. 
The  trade-off  is  that  the  total  size  of  each  of  these 
segments  may  not  exceed  64k  bytes,  and  that  there  is  very 
little  memory  allocation  flexibility. 

At the other extreme is the LARGE compile mode. In
this  mode,  the  code  section  of  each  module  is  allocated  a 
separate  segment.  The  same  is  true  for  the  data  section  of 
each  module.  The  stack  sections  of  all  modules  are  combined 
to  form  a  single  stack  segment.  This  mode  pairs  up  the  code 
and data segments of each module and ensures that the CS and
DS  registers  always  contain  the  values  from  the  same  module. 
In  this  mode,  the  total  amount  of  code  and  data  may  exceed 
64k  bytes,  but  any  one  segment  is  constrained  to  64k. 

The  COMPACT  and  MEDIUM  modes  fall  in  between  the  two 
modes  discussed,  and  offer  differing  degrees  of  segment 
separation. The PL/M-86 Compiler Operator's Manual [21]
states  that  all  modules  in  a  program  must  be  compiled  in  the 
same  mode.  To  maintain  flexibility  and  to  achieve  the  finest 
granularity of segment control, the LARGE mode is used on
all  operating  system  and  application  program  modules  run  on 
the  computer  system  used  for  this  thesis  project. 

2.  Combining  Program  Modules 

LINK86  is  a  program  used  to  combine  the  separately 
developed  and  compiled  program  modules  into  a  relocatable 
object  module.  When  these  separate  modules  were  compiled, 
all addresses were relative to the beginning of each module.
LINK86  accepts  these  separate  modules  as  input,  and  produces 
as  output  a  single  combined  module  whose  addresses  are 
relative  to  the  beginning  of  the  linked  output  module.  In  so 
doing, it resolves all intermodule references to variables
and procedures. The availability of the linker permits the
programmer to develop small, manageable program modules that
can be debugged and maintained separately, and then bound
into a single module prior to loading.

3.  Assigning Memory Locations

The LOC86 program takes as input the relocatable
object  module  from  the  linker  and  produces  as  output  an 
absolute  object  module  in  which  all  addresses  have  been 
converted  to  physical  memory  locations.  It  also  produces  a 
memory  map  which  reflects  the  binding  performed  and  a  symbol 
table  that  shows  the  memory  location  assigned  to  each 
variable, label, and procedure. LOC86 also allows the user
to  specify  exactly  where  in  memory  he  wants  the  various 
modules  of  his  program  to  be  located. 
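
The two address transformations described in sections 2 and 3 can be illustrated with a simplified sketch. It treats addresses as flat offsets and ignores 8086 segmentation, so it is not a model of how LINK86 or LOC86 actually work; the functions link and locate and the module tuples are hypothetical.

```python
def link(modules):
    """Combine modules; rebase each symbol to the start of the linked output.

    `modules` is a list of (name, size, symbols), where `symbols` maps a
    symbol name to its offset from the start of its own module.
    """
    linked_symbols = {}
    offset = 0
    for _name, size, symbols in modules:
        for symbol, module_relative in symbols.items():
            linked_symbols[symbol] = offset + module_relative
        offset += size                      # next module follows this one
    return linked_symbols, offset


def locate(linked_symbols, base):
    """Bind addresses relative to the linked module to absolute locations."""
    return {symbol: base + rel for symbol, rel in linked_symbols.items()}


modules = [
    ("main", 0x40, {"start": 0x00}),
    ("util", 0x20, {"helper": 0x08}),
]
linked, total_size = link(modules)          # helper moves to 0x40 + 0x08
absolute = locate(linked, 0x1000)           # everything shifted to the chosen base
```

The dictionary returned by locate plays the role of the symbol table in this sketch: it records the location finally assigned to each name.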

4.  Object to Hexadecimal File Conversion

The output of the locator is an absolute object file.
This object file, as it exists on secondary
storage,  is  a  sequence  of  binary  digits.  Encoded  in  this 
sequence  of  binary  digits  are  all  the  machine  instructions 
and  data  necessary  to  run  the  process.  Before  execution  can 
actually  take  place,  however,  certain  key  processor 
registers  (viz.,  the  code  segment,  instruction  pointer,  and 
stack segment registers) must be initialized to their proper
values.  This  is  one  of  the  responsibilities  of  the 
initialization  mechanism.  These  values  are  contained  in  the 
binary  object  file.  For  the  equipment  used  in  this  thesis, 
the exact format of the data in these object files was not
presented in any documentation available from the
manufacturer. Before the initialization mechanism can
perform any programmed action on the object files, it must
have, or be able to ascertain, the file format. Fortunately,
there is a file conversion program, called OH86, which
converts this binary object file to the hexadecimal ASCII
format. This program, and the output file it produces, are
well documented. In an effort to expedite development of the
initialization mechanism, it was decided to use the OH86
program and convert the object files to ASCII, so that they
could be more easily manipulated.

There  is,  however,  a  storage  space  trade-off  to 
consider. For example, the eight-bit binary value 0100
1111 is read as 4F in hexadecimal. To encode this in ASCII,
one byte is required for the ASCII representation of the
4 (0011 0100), and one byte is required for the F (0100 0110).
This representation scheme requires twice as much storage in
the MDS as the binary form, but because of limited
documentation  it  makes  the  development  of  the  initialization 
mechanism  much  simpler.  The  bootstrap  program  and  the  loader 
process  in  this  thesis  contain  a  simple  procedure  which 
converts  this  ASCII  representation  back  to  binary  before 
storing  the  data,  so  there  is  no  waste  of  memory  in  the 
multiple  microcomputer  system. 
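
The conversion back to binary can be sketched as below. This covers only the digit-pair decoding step and assumes upper-case hexadecimal digits; it does not parse the record structure of the OH86 output file, and the function names are hypothetical.

```python
def nibble(ch):
    """Convert one ASCII hexadecimal character ('0'-'9', 'A'-'F') to 0..15."""
    code = ord(ch)
    if ord("0") <= code <= ord("9"):
        return code - ord("0")
    if ord("A") <= code <= ord("F"):
        return code - ord("A") + 10
    raise ValueError(f"not an upper-case hexadecimal digit: {ch!r}")


def hex_pair_to_byte(high, low):
    """Combine two ASCII hex characters into the byte they encode."""
    return (nibble(high) << 4) | nibble(low)


def hex_ascii_to_bytes(text):
    """Decode a string of hex digit pairs; two stored bytes become one."""
    if len(text) % 2 != 0:
        raise ValueError("hexadecimal text must contain an even number of digits")
    return bytes(
        hex_pair_to_byte(text[i], text[i + 1]) for i in range(0, len(text), 2)
    )


value = hex_pair_to_byte("4", "F")   # the example from the text: 0100 1111
```

Two ASCII characters go in and one byte comes out, which is why the doubled storage cost is confined to the development system's disc.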

E.  ASSUMPTIONS

In  an  effort  to  expedite  work  on  the  algorithms  for  the 
smart  sensor,  several  assumptions  were  made  which  would 
simplify  the  design  of  the  initialization  mechanism  and  the 
operating system. This simplification primarily involves the
allocation  and  partial  completion  of  some  operating  system 
tables  used  at  run  time.  These  tables  are  used  to  describe 
to  the  operating  system  the  set  of  processes  that  will  be 
running,  and  the  hardware  configuration  that  it  will  be 
running  on.  In  a  general-user  computer  system,  some  of  these 
assumptions  might  not  be  valid.  Future  systems  programs 
developed  for  the  multiple  microcomputer  system  may  wish  to 
generalize  the  system  initialization  mechanism  to  eliminate 
some  of  these  assumptions. 

The  key  assumption  made  is  that  the  run-time  environment 
is very static. That is, the set of processes to be run and
the hardware configuration are known at system generation
time, and remain constant during run time. This assumption
is  justified  by  the  fact  that  the  algorithms  to  do  the 
processing  experiments  for  the  smart  sensor  system  can  be 
partitioned  into  processes  before  the  actual  processing  is 
done.  Therefore,  a  lot  of  information  about  these  processes 
can  be  determined  during  system  generation  and  passed  to  the 
bootload  and  execution  phases.  For  example,  all  the 
processes that will be executed at run time can be
identified  at  system  generation  time. 

Luniewski [11] also states that in order to simplify
initialization and still permit dynamic reconfiguration [9],
some minimal hardware configuration should be assured by the
initialization  mechanism.  This  is  intuitive,  since  without 
at  least  one  processor  and  some  amount  of  primary  memory,  a 
computer  can  do  no  useful  work.  Given  this  minimal  hardware 
configuration, which is a subset of the largest potential
hardware  configuration,  the  initialization  mechanism  could 
employ  dynamic  reconfiguration  to  establish  the  actual 
hardware  configuration.  In  an  effort  to  maintain  simplicity, 
this  thesis  does  not  attempt  to  implement  dynamic 
reconfiguration.  Instead,  the  hardware  configuration  assumed 
by  the  initialization  mechanism  is  the  full  set  of  hardware 
present  in  the  system.  Since  fault-tolerance,  which  requires 
the  capability  to  dynamically  reconfigure  the  system,  is  a 
long-term  goal  of  the  smart  sensor  program,  continuing 
research  is  being  carried  out  to  give  this  initialization 
mechanism  that  capability. 

These assumptions permit linking and locating of the
user's  modules  with  the  same  justification  as  is  used  for 
the  operating  system  modules  -  they  do  not  change  during  the 
lifetime  of  one  initialization.  Thus  they  can  be  treated  the 
same  as  the  system  processes,  and  their  linking  and  locating 
can  be  performed  during  system  generation.  While  this 
approach  is  contrary  to  the  accepted  practice  of  delaying 
the binding of logical resources (viz., memory segments) to
physical resources (viz., memory locations) to enhance
system flexibility, it is fully justified in this
application  by  the  fact  that  the  environment  is  stable. 

The  most  important  item  of  information  that  this 
assumption  provides  is  a  partial  definition  (viz.,  the 
address  space)  of  each  process  that  will  be  run.  This  allows 
the  Process  Definition  Table,  shown  in  figure  II-6,  to  be 
created during the system generation phase. The information
in this table includes the process name (used to address its
MDS file), its initial CPU registers, its stack base (used
for  process  creation),  its  scheduling  priority,  and  its 
processor  affinity.  Processor  affinity  implies  that  the 
programmer  can  state  which  physical  processor  his  process 
will  be  run  on.  This  is  important  in  the  case  of  a  system 
with  dissimilar  processors.  For  example,  one  single  board 
computer  might  be  enhanced  with  a  hardware  multiplier 
circuit,  or  a  special-purpose  I/O  processor.  Also  included 
are  the  initial  CS  and  SS  register  values.  This  structure  is 
created  from  information  provided  by  the  programmer  who 
developed  each  process. 

Process Definition Table
Figure II-6

Another important function that can be done at system
generation time is the allocation of specific segments to
the local on-board memory or to the global shared RAM.
O'Connell and Richardson [18] present the design of an
automated decision technique for memory allocation. Their
design calls for a dynamic memory management scheme. That
is, memory allocation and deallocation are run-time
functions. The mechanism proposed in this thesis performs the
same memory allocation tasks, but they are performed during
system generation. The global-vs-local decision is based on
the two-by-two decision matrix shown in figure II-7, and on
a manually maintained memory map that keeps track of the
free and allocated portions of memory. Note that the upper
lefthand quadrant of the decision matrix in figure II-7
shows two possible choices for locating shared,
non-writeable segments.

While  memory  can  be  conserved  by  locating  shared  data  in 
global  memory  to  avoid  duplication,  the  choice  in  this 
design  is  based  upon  the  desire  to  keep  as  many  segments  as 
possible  in  the  local,  on-board  memory  of  the  using 
processor.  Since  each  access  to  global  memory  requires 
exclusive  use  of  the  system  bus  for  the  duration  of  that 
access, all other processors that might want to access global
memory  during  this  period  are  forced  to  wait  until  the  bus 
is  free.  For  this  reason,  accesses  to  global  memory  should 
be  held  to  a  minimum.  This  can  be  accomplished  by  locating 
all  executable  (viz.,  pure)  code  and  as  much  data  as 
possible  in  the  local  RAM,  and  using  global  storage  for  only 
those  variables  and  data  that  are  shared  and  writeable. 
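
The placement policy described in this section reduces to a two-input decision, sketched below. The function name place_segment is hypothetical, and the sketch encodes only the choice adopted in this design: for shared, non-writeable segments the matrix permits either placement, and local placement (one copy per using processor) is selected here.

```python
def place_segment(shared, writeable):
    """Return "global" or "local" for a segment, per the policy in the text.

    Only shared, writeable segments go to global memory. Shared segments
    that are never written are duplicated in local memory to keep traffic
    off the system bus, at the cost of some extra storage.
    """
    if shared and writeable:
        return "global"
    return "local"


# The four quadrants of the two-by-two matrix.
placements = {
    (shared, writeable): place_segment(shared, writeable)
    for shared in (True, False)
    for writeable in (True, False)
}
```

Choosing local placement for pure code trades memory for bus bandwidth, which matches the stated goal of holding global accesses to a minimum.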

F.  SUMMARY

This  chapter  has  presented  the  environment  in  which  the 
design  described  in  this  thesis  was  developed.  It  has  shown 
the  hardware  involved,  an  overview  of  some  important 
operating  system  principles,  a  look  at  the  software 
development  utilities  used  in  system  generation,  and  the 
assumptions  made  in  the  thesis  and  their  implications.  With 
this  information  as  background,  the  thesis  will  present,  in 
the  following  chapter,  the  design  of  the  initialization 
mechanism  developed  for  this  thesis. 


III.  THE DESIGN


A.  OBJECTIVE 

This  chapter  will  examine  the  different  environments  in 
which  the  three  phases  of  system  initialization  -  system 
generation, bootloading, and run time - take place. This
discussion will unfold the design of the initialization
mechanism developed for this thesis. It will also provide
the reader some insight into the sequencing of the
initialization activities and how the timing of these
activities affects the complexity of the initialization
process. As this discussion progresses, more and more
references will be made to operating system functions and
services. The reader desiring more details on the operating
system, per se, should refer to the thesis by Wasson [T] for
a more complete explanation.

B.  OVERVIEW

Chapter  I  discussed  the  purpose  of  system  initialization 
and  the  three  phases  of  initialization  used  in  this  thesis. 
Recall  that  during  the  system  generation  phase,  the  bootload 
medium,  a  core  image  of  the  base  layer  of  the  operating 
system, was created. The other two phases - bootload and run
time - perform the loading of this core image as well as the
remainder of the operating system and the application
programs from secondary storage into the computer system's
primary memory. The initialization mechanism proposed in
this thesis involves two separate loading functions. Recall
that the bootload program, which runs on the bare hardware,
[...] activating some hardware "Reset" or "Bootload" switch.

The  second  loading  function  is  part  of  the  distributed 
operating  system,  and  is  loaded  into  each  processor  during 
the  bootload  phase  along  with  the  base  layer  of  the 
operating  system.  This  loader  is  used  during  run  time  to 
load  the  remainder  of  the  operating  system  and  the 
application programs and to prepare them to be scheduled and
run.  This  dual-loader  approach  is  common  in  most  existing 
initialization  schemes,  and  will  be  discussed  in  detail 
later  in  this  chapter. 

In this application, since only one processor has access
to the secondary storage system on the MDS, the run-time loader
on this processor is a slightly enhanced version of the
loader process that runs on the other processors. These
enhancements include a "disc I/O" routine, to allow that
loader to access the MDS disc information sent to the 86/12A
serial port, and a procedure to check the Process Definition
Table to determine when the loading function for this
process is complete. For ease of discussion, this enhanced
loader will be referred to as the controlling loader.

C.  THE SYSTEM GENERATION SEQUENCE

Before  the  loading  begins,  however,  there  is  some 
preliminary  work  to  be  done  that  will  simplify  the  remainder 
of  the  initialization.  This  work  is  done  during  system 
generation.  As  discussed  in  Chapter  I,  this  thesis  proposes 
that an action performed at system generation time or
subsequently at run time is inherently simpler than the
same action performed at bootload time. This is due to the
more supportive environment available at system generation
time, and the operating system services available at run
time. Compare these to the bare-hardware environment at
bootload time, and the reasoning behind this premise becomes
clearer. A look at the environment in which system
generation takes place will provide additional justification
for  the  proposal. 

Since  system  generation  takes  place  prior  to  the 
bootload and execution phases, it enjoys the supportive
environment provided by an existing operating system and any
available utility and library routines. As mentioned in
Chapter II, the program development for this thesis was
accomplished on an Intel Intellec Microcomputer Development
System (MDS). The design proposed in this thesis makes
extensive use of the utility programs available in that
environment to accomplish the system generation tasks.
System generation also enjoys the luxury of time. The use of
the ISIS-II operating system in the MDS serves to reduce the
complexity of the bootload and run time phases.

Because  of  the  static  nature  of  the  image  processing 
application  for  which  this  initialization  scheme  was 
designed,  the  system  generation  phase  can  make  the 
assumptions regarding the hardware configuration and the
nature  of  the  application  programs  discussed  in  Chapter  II. 
These  assumptions  permit  extensive  preliminary  processing  to 
be  done  in  the  more  comfortable  environment  of  system 
generation. This relieves the later phases, which occur in
much  less  supportive  environments,  of  the  preparatory 
processing  that  they  would  otherwise  be  required  to  perform. 

By assuming that the hardware and software
configurations are known at system generation time, that
they will remain constant from one initialization to the
next, and that dynamic reconfiguration is not an issue, all
memory allocation decisions can be made during system
generation. As discussed in Chapter II, the decision as to
whether a segment should be placed in local or global memory
is based on a two-by-two decision matrix. The main
difference between the Richardson and O'Connell [18]
allocation scheme and the scheme employed in this thesis is
that the scheme used here is manual, rather than automated.
This means that memory allocation is a one-time system
generation requirement rather than an on-going run-time
function. The O'Connell and Richardson [18] decision matrix
and memory map are maintained on paper, by the person
generating the system, rather than as data structures
maintained by the system initialization mechanism.

The simplest way to view system generation is as a
time-sequence  of  events,  beginning  with  program  design  and 
ending  with  the  creation  of  the  load  module,  or  core  image 
to  be  loaded.  A  detailed  examination  of  this  sequence  of 
events  will  provide  a  foundation  for  the  design  choices  made 
throughout  the  development  of  the  initialization  mechanism 
described  in  this  thesis. 

1.  Program Design

The operating system and initialization scheme
developed for this project rely on the programmer to design
his programs to take full advantage of the multiprogramming
and multiprocessing capabilities provided by the hardware
and the operating system. This requires that the programmer
be somewhat, though not intimately, familiar with the
operating system philosophy and the hardware configuration.
Given this basic knowledge, and the widely-accepted
technique of structured programming, it is relatively easy
for the programmer to design the required process structure
into his programs. This involves partitioning each
application into a group of cooperating processes,
including in each process the necessary operating system
calls to provide inter-process synchronization, and
explicitly declaring shared memory segments for
communication between processes.

In  the  development  of  each  process,  there  are  some 
simple "ground rules" the programmer should follow to
simplify memory allocation and enhance the performance of
the system. First, all data shared by processes should be
declared to be in segments which are "external" to the
application procedure [22]. This implies that the variable
is declared and defined elsewhere. Furthermore, an absolute
memory address must NEVER be coded into any application.
Second, all program code should be reentrant [22]. This
allows each invocation of a procedure to store its variables
on the process stack. Thus one invocation will not overwrite
the variables used by the previous invocation, as would be
the case if the variables were stored as part of the
procedure itself. The third ground rule is imposed to reduce
the system bus contention problem discussed in Chapter II,
and merely requires that references to shared, writeable
variables and structures be held to a minimum. This
typically involves a single read reference to "input" data
to the process and a single write reference to "output" the
data (results). In particular, shared segments should never
be used for temporary or intermediate results. The fourth
rule requires that the programmer segregate writeable and
non-writeable segments whenever possible. This will allow finer
granularity in the memory allocation process. Finally, the
programmer must declare the Gate module as an external
procedure in every process to be run. This will resolve all
the  external  references  to  the  operating  system  interface. 

The programmer is also given the responsibility of
initially identifying his process to the operating system.
Recall  that  a  process  can  be  identified  by  its  address  space 
and  its  execution  point.  Therefore,  the  programmer  must 
identify  all  the  segments  in  the  process  address  space  and 
must  identify  which  of  these  segments  will  be  modified 
(written  into)  by  this  process.  Furthermore,  the  programmer 
must  identify  the  initial  entry  point,  and  any  parameters 
passed  to  this  entry  point.  This  information  is  actually 
provided  to  the  system  operator,  who  prepares  the  Process 
Definition  Table  and  makes  the  memory  allocation  decisions 
based  on  the  full  set  of  initial  process  identification 
information, as discussed below in the section on memory
allocation. 

2.  Compilation

After  the  program  has  been  developed  and  written,  it 
must  be  compiled.  The  compiler  translates  the  high-level 
language  code  into  machine  language  instructions.  For  this 
application,  an  additional  check  is  made  at  system 
generation time to ensure that all program modules have been
compiled  with  the  same  mode  option.  Recall  from  Chapter  II 
that  the  compiler  mode  option  determines  the  degree,  or 
granularity, of the segmentation. This information must be
supplied by the programmer, since he is the one who performs
the  compilation. 
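
The consistency check mentioned above might look like the following sketch. It assumes the programmer-supplied information has been gathered into a simple mapping from module name to compile mode; the function common_compile_mode and the module names are hypothetical.

```python
def common_compile_mode(module_modes):
    """Return the one mode shared by all modules; raise if they differ.

    `module_modes` maps a module name to the mode option it was compiled
    with, as reported by the programmer. An empty mapping is also rejected.
    """
    modes = set(module_modes.values())
    if len(modes) != 1:
        raise ValueError(f"expected exactly one compile mode, found {sorted(modes)}")
    return modes.pop()


mode = common_compile_mode({"loader": "LARGE", "gate": "LARGE", "sensor": "LARGE"})
```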

3.  Linking

The  third  step  in  the  system  generation  sequence  is 
the  linking  together  of  the  various  modules  that  make  up  a 
process.  Since  the  programmer  knows  exactly  which  modules 
comprise  his  process,  he  is  in  a  position  to  pre-link  these 
modules.  Since  each  process  needs  an  interface  to  the 
operating  system,  each  process  is  also  linked  to  the  Gate 
module  previously  described.  This  implies  that  each  process 
has  declared  the  Gate  module  as  an  external  procedure. 

4.  Memory Allocation

While  the  programmer  is  in  the  best  position  to 
compile  his  modules  and  link  them  into  individual  processes, 
he  is  not  in  a  position  to  know  the  degree  of  segment 
sharing  that  will  take  place.  Neither  is  he  in  a  position  to 
know  where,  in  the  system  memory,  other  programmers  might 
elect to load their processes. Clearly the memory allocation
decisions must be centralized to avoid chaos. The computer
system  operator,  or  perhaps  a  "chief  programmer",  is  in  the 
best  position  to  make  these  decisions.  This  thesis  will 
assume  that  these  decisions  are  made  by  the  operator  as  part 
of the system generation process. As mentioned in Chapter
II, the global-vs-local decisions are made using a decision
matrix. But the decisions as to the specific memory locations to
allocate for each segment require some information from the
programmer. Specifically, the programmer must provide a list
of the segments in the address space of his process, the
length of each segment (which is available from the linker
output), and whether each segment is writeable or
non-writeable. The identification of segments must be unique
across all processes in the system to ensure that shared
segments can be unambiguously distinguished. Figure III-1
shows a suggested Process Information Form which might be
used  to  standardize  the  content  and  format  of  this 
information.  The  form  contains  one  entry  for  each  segment  in 
the  address  space,  and  indicates  which  of  the  above 
attributes  apply.  The  programmer  is  also  asked  to  identify 
which  other  processes  will  share  each  segment.  This  is  used 
only  to  cross  check  for  possible  design  errors  in 
interprocess  communication.  The  per-process  list  also 
includes  the  initial  parameters,  the  process  priority,  and 
processor  affinity  information  that  the  operator  needs  to 
build the Process Definition Table used by the bootload
program  and  the  run-time  loader  processes.  This  information 
form  is  provided  for  each  application  process  and 
(separately)  for  the  operating  system  kernel  for  each 
physical  processor.  The  kernel  includes  only  one  per-process 
data  segment:  the  kernel  stack.  Since  the  kernel  is  linked 
only  once  for  each  processor,  the  operator  must  "create"  the 
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PROCESS  INFORMATION  LIST 


PROCESS  NAME:  _  PRIORITY: _  AFFINITY: 

Initial  Parameters:  SS:_  _  AX:  RX: 


CX:  _  PX:  ES  : 


Process  Information  Form 
Figure  III — 1 
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corresponding  stack  for  each  process.  As  discussed  by  Wasson 
[*?],  the  kernel  stack  must  be  allocated  as  a  logical 
extension  of,  and  at  a  lower  address  than,  the  stack  segment 
for  each  process. 
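The information collected on the Process Information Form, and the
cross-check of sharing declarations it makes possible, can be
sketched as follows. This is only an illustration in Python, not
part of the thesis design: the names SegmentEntry, ProcessInfoForm,
and cross_check are hypothetical, and the checks shown (each sharing
declaration must be mirrored by the partner process, with the same
length and writeable attribute) are one plausible reading of
"cross-check for possible design errors".

```python
from dataclasses import dataclass, field


@dataclass
class SegmentEntry:
    name: str                 # segment identifier, unique across the system
    length: int               # length in bytes, taken from the linker output
    writeable: bool
    shared_with: set = field(default_factory=set)  # names of sharing processes


@dataclass
class ProcessInfoForm:
    name: str
    priority: int
    affinity: int             # logical CPU number the process should run on
    initial_registers: dict   # e.g. {"SS": 0x2000}
    segments: list = field(default_factory=list)


def cross_check(forms):
    """Return messages describing inconsistent sharing declarations."""
    by_name = {form.name: form for form in forms}
    problems = []
    for form in forms:
        for seg in form.segments:
            for other in sorted(seg.shared_with):
                partner = by_name.get(other)
                if partner is None:
                    problems.append(
                        f"{form.name}: {seg.name} shared with unknown process {other}")
                    continue
                twin = next((s for s in partner.segments if s.name == seg.name), None)
                if twin is None or form.name not in twin.shared_with:
                    problems.append(
                        f"{other} does not declare sharing {seg.name} with {form.name}")
                elif (twin.length, twin.writeable) != (seg.length, seg.writeable):
                    problems.append(
                        f"{seg.name}: attributes differ between {form.name} and {other}")
    return problems
```

An empty list from cross_check means every declared sharing
relationship is confirmed by both processes involved.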

Armed with this process information list and the
allocation decision matrix, the operator is now prepared to
make the actual allocations of specific memory locations to
segments. Since he is, in effect, the Memory Manager process
described by O'Connell and Richardson [18], he will maintain
the System Memory Maps, for both local and global RAM, which
reflect the status of the system memory. As shown in figure
III-2, the memory map contains the base address and length
of each named segment and the base address of the free or
unallocated areas of memory. The memory map is completed as
a sorted list to aid in detecting allocation errors made by
the operator. The local and global memory in the system are
allocated separately; only shared, writeable segments are
allocated to global memory. A useful guideline is to
allocate all local kernel segments at addresses below the
applications so that application stacks can never
"overflow" into the kernel. Recall from above that the
operator must "add" a kernel stack segment for each process.
It is also up to the operator to avoid "checkerboarding", or
fragmentation, the condition in which many small free areas
exist whose combined size is large enough to contain a
segment, but none of which is large enough alone. This
condition can usually be avoided by careful allocation, but it
may also involve some trial-and-error to obtain a proper fit.
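The value of keeping the memory map as a sorted list can be seen in
a short sketch. The Python below is an illustration only, under the
assumption that a map is a list of (name, base, length) entries for
one memory of known size; the function names are hypothetical. A
single pass over the sorted entries exposes both overlapping
allocations and the free areas, and a second test recognizes the
checkerboarding condition described above.

```python
def check_memory_map(segments, mem_size):
    """Scan a memory map given as (name, base, length) tuples.

    Returns (overlaps, free_areas): the names of segments that begin
    inside a previously allocated area, and the (base, length) of
    each unallocated area.
    """
    overlaps, free_areas = [], []
    cursor = 0                                   # first address not yet accounted for
    for name, base, length in sorted(segments, key=lambda s: s[1]):
        if base < cursor:
            overlaps.append(name)                # allocation error by the operator
        elif base > cursor:
            free_areas.append((cursor, base - cursor))
        cursor = max(cursor, base + length)
    if cursor < mem_size:
        free_areas.append((cursor, mem_size - cursor))
    return overlaps, free_areas


def is_checkerboarded(free_areas, needed):
    """True if the free areas could hold `needed` bytes only in combination."""
    sizes = [length for _, length in free_areas]
    return sum(sizes) >= needed and max(sizes, default=0) < needed
```

For example, with a kernel at 0 (length 1000H), one application at
1800H (length 800H), and another at 1C00H (length 400H) in a 3000H
byte memory, the scan reports the second application as an overlap
and lists free areas at 1000H and 2000H.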

5. Locating

Once all the allocation decisions have been made,
the actual assignment of physical memory locations is made
using the locator utility program, LOC86. The system
operator passes the allocation decisions made for each
process to LOC86 as parameters. These parameters indicate to
the locator the base address of each segment, including the
kernel stack, in the process address space.

The operating instructions for LOC86 contain the
options and parameters required to control memory allocation
[20]. The output from the locator is the binary core image
of the process that was input to it. This image is complete
with load addresses for the code and data in the process, as
well as the CS and ?S register values necessary to start the
process running. The locator is run once for each
application process, and once per CPU to locate the
distributed operating system kernel that is available
through the Gate to all processes.

6.  File  Conversion 

As discussed above, the memory management function
was not automated due to the lack of documentation
concerning binary object files. For the same reason, the
bootload program and the run-time loader processes were
designed to read the ASCII files output by the file
conversion program, OH86. The OH86 output format is well
documented [22]. So the last step in the system generation
process is to run OH86, once per located process and CPU
kernel, to transform the binary object file into the ASCII
format expected by the loading processes. A skeletal example
of the output produced by OH86 is contained in Appendix A.

7. System Generation Summary

Before proceeding into a discussion of the bootload
phase and the environment in which the bootload program
runs, it will be beneficial to pause and examine exactly
what was accomplished during system generation, and exactly
where the initialization process stands when system
generation has been completed. This thesis views system
generation as a time-sequence of events that begins during
program design, and proceeds through compilation, linking,
memory allocation, locating, and file conversion. At this
point, the ASCII representation of the core image of each
process to be loaded has been created and stored as a file
on the secondary storage (viz., floppy disc) in the MDS. The
disc also contains two other files: the bootstrap program
and the kernel base with the run-time loader process. A
graphic representation of the disc, as it appears at the end
of system generation time, is shown in figure III-3. Note
that for each process the loader needs the disc address
(i.e., track number and sector number) of the target file. In
the MDS-based loader, this address is the actual filename,
since the filename is used by the ISIS-II operating system
disc routines on the MDS. The filename is one of the items of
information available to the loader process in the Process
Definition Table.

BOOTSTRAP PROGRAM        (hexadecimal)
KERNEL BASE              (hexadecimal)
IDLE PROCESS             (hexadecimal)
LOADER PROCESS           (hexadecimal)
APPLICATION PROCESS #1   (hexadecimal)
APPLICATION PROCESS #2   (hexadecimal)

Disc Contents at End of System Generation
Figure III-3

D. THE BOOTLOAD PHASE

When it is desired to initialize the system and run the
application programs, the bootload phase begins. In most
computer systems, the bootload program is invoked by
activating a "reset" or "bootload" switch. This causes a
jump to the first instruction of the bootload program, which
is contained in ROM. After the proposed hardware
enhancements have been made, and the complete operating
system has been developed, the bootload program for this
system will be placed on EPROM, and will be invoked in this
same manner. This section will discuss the sequence of
initialization actions that take place upon invoking this
ROM-resident bootload program.

Like system generation, the bootload phase can be viewed
as a time-sequence of activities, beginning when the
bootload switch is pressed, and ending when the operating
system kernel is running. When the bootload switch in the
multiple microcomputer system is depressed, it causes a
hardware interrupt to occur in all the processors in the
system. The interrupt handler for the bootload interrupt is
the ROM-resident bootload program in each processor.




1. Invoking the ROM-resident Bootloader

The bootload routine is a small, very simple program
that serves three basic functions. First of all, it must
determine which CPU in the system will be the "Bootload
CPU". The Bootload CPU will serve as the master or
controlling CPU throughout the bootload and run-time loading
phases. While the bootload programs in all CPU's are
identical, the Bootload CPU will execute some sequences of
instructions that the other processors will not. When the
bootload programs begin execution, each one will attempt to
read the same variable in global memory. This variable will
be initialized by the EPROM programs to a predetermined
value. As mentioned in the section on memory allocation,
access to global memory requires that a processor have
exclusive use of the system bus. There is a built-in system
bus "lock" that can be set as soon as a processor gets
control of the bus. This lock will be used to resolve the
conflict of multiple simultaneous access attempts. The
processor that first gets control of the bus will become the
Bootload CPU. This processor will then alter the value of
the global variable. When the bus lock is turned off, and
other processors are able, in turn, to access the variable,
they will see that the variable has been altered, and enter
a wait loop, awaiting further instructions from the Bootload
CPU.

To permit the programmer to specify which physical
processor he wants his processes to run on (i.e., the
affinity of the process), there must be some way to identify
these processors. Physically, the processors can be
identified by some unique serial number or identification
number. This type of identification is inconvenient for the
operating system because the physical processors can be
removed and replaced for maintenance, testing, and for
various other reasons. Therefore, the initialization scheme
needs a method of assigning logical CPU numbers to the
physical processors currently in the system. This can be
done in a manner similar to determining the Bootload CPU. By
convention, this scheme assigns logical CPU number 0 to the
Bootload CPU. The Bootload CPU enters its serial number,
which is contained in its EPROM, into the first entry of a
global structure called the CPU$TABLE. The Bootload CPU then
sets a global variable called LOGICAL$CPU$NUM equal to 1,
and unlocks the lock which has been associated with that
variable. The other processors will now "race" to access
LOGICAL$CPU$NUM. The winner of the race will set the lock,
enter its serial number into the second entry in the
CPU$TABLE, increment LOGICAL$CPU$NUM, and then unlock the
lock. This process will continue until all the physical
processors have been assigned a logical CPU number. The
Bootload CPU will know how many CPU's there are in the
configuration and that all processors in the system have
been assigned a logical number after some fixed time period
(a few milliseconds) has elapsed. In addition to the two CPU
numbers in the CPU$TABLE, each processor also has a
mailbox: a location used for a primitive method of
interprocessor communication with the Bootload CPU.
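The race for the Bootload CPU role and for logical CPU numbers can
be simulated in a few lines. The sketch below is an illustration in
Python, not the EPROM code: threads stand in for processors, a
threading.Lock stands in for the system bus lock, and the class and
function names are hypothetical. As a simplification, the election
and the numbering are performed in one critical section, whereas the
text above describes them as two similar steps.

```python
import threading

UNCLAIMED, CLAIMED = 0, 1


class GlobalMemory:
    """Stand-in for the shared variables kept in global RAM."""

    def __init__(self):
        self.bus_lock = threading.Lock()   # models the system bus lock
        self.bootload_flag = UNCLAIMED     # the predetermined initial value
        self.logical_cpu_num = 0           # next logical number to hand out
        self.cpu_table = []                # serial numbers, indexed by logical number


def bootload_entry(serial, mem, roles):
    """What each processor does when the bootload interrupt occurs."""
    with mem.bus_lock:                     # only one processor holds the bus
        is_bootload_cpu = mem.bootload_flag == UNCLAIMED
        if is_bootload_cpu:
            mem.bootload_flag = CLAIMED    # later arrivals see the altered value
        logical = mem.logical_cpu_num
        mem.cpu_table.append(serial)       # entry number equals logical number
        mem.logical_cpu_num += 1
    roles[serial] = (logical, is_bootload_cpu)


def simulate(serials):
    mem, roles = GlobalMemory(), {}
    cpus = [threading.Thread(target=bootload_entry, args=(s, mem, roles))
            for s in serials]
    for cpu in cpus:
        cpu.start()
    for cpu in cpus:
        cpu.join()
    return mem, roles
```

Which processor wins is not predictable, but every run yields
exactly one Bootload CPU, and that processor always holds logical
CPU number 0.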

2. Accommodating the Initial Hardware

As previously discussed, the hardware configuration
does not presently include online secondary storage, and the
decision was made not to write the bootload into EPROM until
the development was complete. Some temporary alterations
were made in the initialization mechanism to permit the
development to proceed with this initial hardware
configuration. The use of the MDS to simulate secondary
storage was mentioned previously. The bootstrap program
reads data from the serial port of one of the 86/12A
single-board computers. A program was written for the MDS
that reads the hexadecimal object files from floppy disc and
outputs the hexadecimal data to the MDS serial port. There
is a cable connecting the two serial ports. The cable is
made to allow a primitive sort of protocol between the two
systems via the "clear to send" and "request to send" status
lines [22]. This constrains the loading function to having
access to secondary storage from only one processor, rather
than from any processor on the system bus. To simulate the
presence of an EPROM bootload program, the ICE-86 in-circuit
emulator was used to load the bootload program into RAM.
Using the dual-port memory capability, the ICE-86 can load
the bootload into each processor's local memory. The ICE-86
was also used to alter the interrupt vector in each CPU so
that the preempt interrupt would transfer to the bootload
program. Finally, the processor connected to the MDS was
given a slightly different version of the bootload program
that starts its execution by sending a preempt interrupt to
all other processors, simulating the bootload switch.

3. Loading the Bootstrap Program

With these preliminaries out of the way, the
Bootload CPU can start the actual bootstrap loading
function. This load involves the first access to disc by the
initialization mechanism. Since the bootload program is
EPROM-resident, simplicity is a primary concern. For that
reason, the bootload program will merely read from a fixed
address on disc, and load the data into a fixed area of
global memory. For the same reason, only the Bootload CPU
will access the disc. This simplifies the bootload programs
by eliminating the need for a complex synchronization method
to allow the processors to share the disc. The bootload
program on the Bootload CPU will merely read a single disc
record, and load that record into a pre-specified global
memory buffer. Note that this disc record is already in
executable format (viz., not a hexadecimal file). The
bootload program will then transfer control, with an
unconditional jump, to the location of the first byte in the
buffer. This will transfer control from the EPROM bootload
program in the Bootload CPU to the bootstrap program just
read in from disc. Figure III-4 shows the contents of the
system memory after the bootload program has been run.

4. Executing the Bootstrap Program

The block of data just read in contains the
bootstrap program developed during system generation. Recall
that this program is designed to load the base layer
(kernel) of the operating system from disc into primary
memory. Since each processor's local memory will contain
parts of this kernel, each processor will need to execute
the bootstrap program to load its kernel. For simplicity,
all processors will share the same bootstrap program code,
which will be located in global RAM. The Bootload CPU (at
this point executing the bootstrap program) will do the
actual disc read for all processors. This is consistent with
the method used to accommodate the initial hardware
configuration as discussed above. The Bootload CPU will load
the hexadecimal file containing the base layer of the kernel
into a global memory buffer, leaving it in the hexadecimal
format. The Bootload CPU, since it is already running, will
then be the first processor to load the kernel into its
local memory. The bootstrap program includes functions to
read the hexadecimal object file (the kernel) from the
global RAM buffer, convert the data to its binary
(executable) representation, and load it at the addresses
specified in the hexadecimal file. Recall that this load
address for each segment in the kernel is made up of the
segment base address in the segment base address record and
the load address offset contained in the data record itself.

LOCAL MEMORY #1          LOCAL MEMORY #n

EPROM                    EPROM
BOOTLOAD PROGRAM         BOOTLOAD PROGRAM

RAM                      RAM

System Memory at End of Bootload
Figure III-4
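The conversion from hexadecimal records to a loaded binary image,
including the load-address rule just recalled, can be sketched as
follows. This is an illustration in Python rather than the
bootstrap code itself, and it assumes the file uses the standard
Intel hexadecimal object format record types: 00 (data), 01 (end of
file), 02 (extended segment address, taken here to be the "segment
base address record"), and 03 (start segment address, carrying CS
and IP). The function name load_hex is hypothetical.

```python
def load_hex(text, memory):
    """Load Intel hexadecimal records from `text` into `memory` (a bytearray).

    Returns the (CS, IP) pair from a start-address record, or None if
    the file contains no such record.
    """
    segment_base = 0
    start = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("record does not begin with ':'")
        raw = bytes.fromhex(line[1:])            # ASCII hex digits to binary
        if sum(raw) & 0xFF != 0:                 # all bytes, checksum included
            raise ValueError("checksum error")
        count = raw[0]
        offset = (raw[1] << 8) | raw[2]
        record_type = raw[3]
        data = raw[4:4 + count]
        if record_type == 0x00:                  # data record
            address = (segment_base << 4) + offset
            memory[address:address + count] = data
        elif record_type == 0x01:                # end of file
            break
        elif record_type == 0x02:                # segment base address
            segment_base = (data[0] << 8) | data[1]
        elif record_type == 0x03:                # initial CS and IP
            start = ((data[0] << 8) | data[1], (data[2] << 8) | data[3])
    return start
```

With a segment base of 1000H and a data record offset of 0010H, for
instance, the data bytes are placed beginning at physical address
10010H.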

All other processors are still executing the EPROM
bootload program, waiting to be signalled by the Bootload
CPU via their "mailboxes". The Bootload CPU now signals each
CPU in turn to load its kernel, and then waits for a signal
that the CPU has done so. Note that before signalling, the
Bootload CPU ensures that the target CPU's kernel is in the
global buffer, either read in from disc or still present
from the loading of a previous CPU. When signalled by the
Bootload CPU, each CPU transfers (jumps) from the EPROM
bootload program to the global RAM bootstrap program. It
then executes the routine to read the file (the kernel) from
the buffer, convert the data back to its binary
representation, and load it into the addresses specified in
the ASCII file. Since the identity of the kernel hexadecimal
file is well defined, and since the number of CPU's is known
(viz., available from the CPU Table), this bootload
procedure is relatively simple. Recall that simplicity is a
primary goal during the bootload phase since the environment
is only the bare hardware. As each processor completes its
bootloading task, it will perform an unconditional jump to
the first location in its kernel (now in executable form).
The Bootload CPU will jump to the kernel after all other
CPU's have finished their bootloading task and signalled
this fact to the Bootload CPU.
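The one-at-a-time handshake between the Bootload CPU and the other
processors can be illustrated with a small simulation. The Python
below is a sketch only: threads stand in for processors, and a pair
of threading.Event objects per CPU stands in for each mailbox (the
thesis describes a mailbox simply as a memory location). The
function name and the list used to record the loading order are
hypothetical.

```python
import threading


def run_bootstrap(cpu_count):
    """Simulate the mailbox handshake; return the order of kernel loading."""
    go = [threading.Event() for _ in range(cpu_count)]     # Bootload CPU to CPU n
    done = [threading.Event() for _ in range(cpu_count)]   # CPU n to Bootload CPU
    loaded = []

    def secondary(cpu):
        go[cpu].wait()          # wait loop in the EPROM bootload program
        loaded.append(cpu)      # stands in for converting and loading the kernel
        done[cpu].set()         # report completion through the mailbox

    others = [threading.Thread(target=secondary, args=(cpu,))
              for cpu in range(1, cpu_count)]
    for t in others:
        t.start()

    loaded.append(0)            # the Bootload CPU loads its own kernel first
    for cpu in range(1, cpu_count):
        go[cpu].set()           # signal the next CPU in turn
        done[cpu].wait()        # and wait until it has finished
    for t in others:
        t.join()
    return loaded
```

Because the Bootload CPU waits for each completion signal before
signalling the next processor, the global buffer is never read by
two processors at once.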

This jump will simulate a preempt interrupt in the
Inner Traffic Controller interrupt handler [7]. The jump is
to a special entry point in the interrupt handler routine
that is used only for initialization. This entry point saves
the processor register values, which include the logical and
physical CPU numbers, that must be saved for later use by
the Inner Traffic Controller Scheduler. The entry into the
Inner Traffic Controller marks the end of the bootload phase
and the transition into the run time phase. At this point,
all processors are executing in the kernel. The bootstrap
program is no longer needed, and will be overwritten. The
system memory at the end of the bootstrap sequence is
configured as shown in figure III-5.

E. RUN TIME

The loading performed at run time is conceptually quite
similar to the bootloading discussed in the previous
section. One difference between the two phases is that the
run time loading involves all processes that are to be run.
But the main difference is that the "Bootload" function is
done by run time loader processes that run on the virtual
processor provided by the kernel. This implies that the
instruction set now includes the operating system primitives
provided by the kernel (e.g., ITC_ADVANCE, ITC_AWAIT, and
Create_Process). This provides a much more supportive
environment than the bare hardware of the bootload phase.

1. Invoking the Loader Processes

To understand exactly what happens when the bootload
program jumps to the preempt handler in the kernel, it will
be beneficial to review just what is in the kernel base, and
how the contents of the kernel go about performing the
remainder of the loading activities.

There  are  actually  two  processes  in  the  kernel  base. 
The  first  is  the  idle  virtual  processor.  Recall  that  this 
"processor"  is  invoked  when  there  is  no  other  useful  work 
available  to  be  run  on  a  physical  processor.  The  other 
kernel process is the run time loader process, just a
modified version of the O'Connell and Richardson memory
manager process [18]. All kernel segments are included in
the  address  space  of  both  these  kernel  processes. 

The Virtual Processor Map (VPM) in the Inner Traffic
Controller was initialized during system design to reflect
that the idle virtual processor is "running" on each CPU.
The memory manager (i.e., loader) is initialized in the ready
state and with a high priority. All other virtual processors
are in the idle state.

The Traffic Controller's Active Process Table (APT)
is initialized with NO application processes. All virtual
processors visible to the Traffic Controller are shown to be
running an idle process.


Because  of  this  initial  state  created  during  system 
generation,  the  jump  to  the  Inner  Traffic  Controller  at  the 
end of the bootload phase appears to the kernel as a preempt
interrupt  of  the  idle  virtual  processor.  This  preempt  causes 
the  higher  priority  loader  process  to  be  scheduled  and  run 
on  each  physical  processor. 

These  loader  processes  all  have  the  Process 
Definition  Table  in  their  address  space  as  an  external  data 
segment  shared  by  all  loader  processes.  This  table  is  the 
primary  data  base  used  to  drive  the  remainder  of  the  loading 
function. 

2. Loading the Application Processes

Now  that  the  operating  system  kernel  is  running  on 
each  physical  processor,  it  can  be  used  to  load  the 
application  processes  from  disc.  Since  each  application 
process  exists  as  a  hexadecimal  object  file  on  the  disc,  and 
since  the  loader  processes  have  a  complete  description  of 
each  application  process  in  their  address  spaces  (viz.,  the 
Process Definition Table), the remainder of the loading
tasks are relatively straightforward. This will involve
reading each application process from the disc, placing it,
in executable (i.e., binary) form, at the appropriate
location in the system memory, identifying the process to
the kernel, and finally, causing the kernel to schedule and
execute the application processes.

The Bootload CPU still serves as the system master, and
still makes all disc I/O requests. Since their register
values, including their serial numbers and logical CPU
numbers, were passed to them at the beginning of run time,
each processor can determine whether or not it is the
Bootload CPU (i.e., is its logical CPU number 0?). If it is
not, the loader process will do an ITC_AWAIT until it is
signalled to proceed (via an ITC_ADVANCE) by the Bootload
CPU. The sequence of operations performed at run time calls
for the Bootload CPU to read the first non-kernel
hexadecimal object file from the disc and to store it in the
global RAM buffer. The Bootload CPU then checks the Affinity
in the Process Definition Table to determine which physical
processor the process is intended to run on. It will then do
an ITC_ADVANCE on the appropriate eventcount for the loader
process in that CPU. Note that there is the special case of
application processes being loaded on the Bootload CPU. In
this case, the signalling will be slightly different. But
this will require only a minor addition to the loader
program.

The  designated  processor's  loader  process  will  load 
and  convert  the  hexadecimal  object  file  as  described  in  the 
previous  section.  In  addition,  it  will  extract  from  the 
hexadecimal object file the CS and IP register values. It
will enter these values into this loader process's Process
Parameter Block, along with the SS register value from the
Process Definition Table. The loader process then calls the
kernel Traffic Controller procedure "Create_Process",
passing the address of the Process Parameter Block as an
argument. Create_Process makes the necessary entries in the
Active Process Table to describe the just-loaded process,
and initializes the kernel stack for this process.
Create_Process then returns into the loader process from
which it was called. The loader process will, in turn,
notify the Bootload CPU that it has finished, and the
Bootload CPU will read in the hexadecimal object file for
another process.
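The run-time loading sequence, driven by the Process Definition
Table and synchronized with eventcounts, can be illustrated as
follows. This Python simulation is a sketch under stated
assumptions, not the kernel's code: the EventCount class is a
stand-in whose advance and await_value methods only imitate the
role of ITC_ADVANCE and ITC_AWAIT, threads stand in for the loader
processes, the Process Definition Table is reduced to (name,
affinity) pairs ended by a null entry, and appending a name to a
list stands in for loading the process and calling Create_Process.
The final step, in which the waiting loaders are released so the
simulation can end, is an addition for this sketch only.

```python
import threading


class EventCount:
    """Minimal eventcount: a counter that only increases."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def advance(self):
        with self._cond:
            self._value += 1
            self._cond.notify_all()

    def await_value(self, target):
        with self._cond:
            self._cond.wait_for(lambda: self._value >= target)


def run_time_load(pdt, cpu_count):
    """Return {logical CPU: [names of processes created there]}."""
    work = [EventCount() for _ in range(cpu_count)]  # one per loader process
    finished = EventCount()                          # loaders report back here
    buffer = {"name": None}                          # stands in for the global RAM buffer
    created = {cpu: [] for cpu in range(cpu_count)}

    def loader(cpu):
        turn = 0
        while True:
            turn += 1
            work[cpu].await_value(turn)              # wait to be signalled
            if buffer["name"] is None:               # nothing left to load
                return
            created[cpu].append(buffer["name"])      # load, then Create_Process
            finished.advance()                       # notify the Bootload CPU

    loaders = [threading.Thread(target=loader, args=(cpu,))
               for cpu in range(1, cpu_count)]
    for t in loaders:
        t.start()

    handed_out = 0
    for entry in pdt:
        if entry is None:                            # null entry: all processes loaded
            break
        name, affinity = entry
        buffer["name"] = name                        # "read the file into the buffer"
        if affinity == 0:
            created[0].append(name)                  # special case: Bootload CPU itself
        else:
            work[affinity].advance()
            handed_out += 1
            finished.await_value(handed_out)

    buffer["name"] = None
    for cpu in range(1, cpu_count):
        work[cpu].advance()                          # release loaders so they can exit
    for t in loaders:
        t.join()
    return created
```

Each process appears only on the processor named by its affinity,
in the order in which it is listed in the table.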

3. Initiating Application Process Execution

This sequence of events is repeated until the loader
process on the Bootload CPU finds a null entry in the
Process Definition Table, which signifies that all processes
have been loaded and created. This means that all system
initialization functions - system generation, bootloading,
and run-time loading - are completed, and all application
processes are created, loaded on their respective
processors, and in the ready state. The only thing required
now is for the Bootload CPU to call the ITC_SET_PREEMPT
procedure for each virtual processor known to the Traffic
Controller and then do an ITC_AWAIT. This will cause the
normal scheduling functions to run the highest priority
process that is ready to be run on each processor.

F. SUMMARY

In this chapter, the entire sequence of events required
for initialization of a multiple microcomputer system has
been examined. Each of the initialization phases - system
generation, bootloading, and run time - and the environments
in which they occur, have been analyzed. This analysis was
intended to show the reader how initialization can, indeed,
be simplified by a careful sequencing of initialization
activities.

IV. SUMMARY AND CONCLUSIONS

A. SUMMARY

The goal of this thesis has been to develop a system
initialization mechanism for the Intel 8086-based multiple
microcomputer system to be used by the Solid State
Laboratory at the Naval Postgraduate School for "smart
sensor" research. A secondary goal, from the outset, has
been to present a system initialization design philosophy
that would help fill a void in current computer science
literature. This design philosophy asserts that the issues
of system generation and bootstrap loading deserve a level
of consideration equal to, and concurrent with, operating
system issues. The basic premise of the thesis is that
simplification leads to a more versatile and robust design
and, subsequently, to a system initialization mechanism that
is easily understood and readily adaptable to a variety of
hardware and operating system configurations.

The simplification in this design approach is achieved
by two means. The first is a core-image driven loader. This
technique involves creating a copy of the base layer of the
operating system as it should appear in primary memory
immediately prior to execution. This core image is then
stored on some secondary storage medium. When it is desired
to initialize the system, this core image is merely loaded
into primary memory and control is passed to the first
instruction.


The other, and probably more meaningful, means of
simplification is to carefully sequence the required
initialization activities such that each is performed in the
most supportive environment available. This transfers
functional complexity to a phase of initialization that
enjoys the most operating system and utility program
support, and removes possible complexity from the bare
hardware environment of the bootload program. Since the most
supportive environment in this application is available at
system generation time, the goal was to accomplish as many
initialization activities as possible during this phase.
With the assumptions (based on the application for which the
system was designed) made at system generation time, this
thesis was able to fully exploit this most supportive
environment. In so doing, the generation of the complete
core image and all memory allocation were accomplished
during system generation. As the core image of each process
was created, the identity of the process (viz., its address
space and execution point) was encoded into the image. Thus
every process in the system could be completely
characterized with information contained in its core image.
This capability creates a compilation-independence that is
important to a general purpose initialization mechanism.




The  system  initialization  scheme  designed  for  this 
thesis  makes  extensive  use  of  the  operating  system  kernel 
primitives available at run time. In particular, the
ITC_ADVANCE and ITC_AWAIT primitives are used for
interprocess communication during the loading of the
application processes, and the Create_Process function is
used  to  identify  the  application  processes  to  the  kernel. 

B.  FOLLOW-ON  WORK 

This  thesis  has  scratched  the  surface  of  an  extremely 
interesting  and  challenging  research  area.  But  in  developing 
the  initialization  mechanism  discussed  here,  it  brought  to 
light  many  follow-on  research  ideas.  Naturally.  the  first 
follow-on  work  should  concentrate  on  completing  the 
implementation  of  the  design  presented  in  this  thesis.  mhe 
design  and  implementation  should  then  be  extended  to 
automate  as  many  of  the  manual  functions  as  possible.  This 
should  include  complete  automation  of  the  linking  and 
locating  processes,  possible  elimination  of  the  file 
conversion  program,  and  automated  memory  allocation  as 
discussed  by  O'Connell  and  Richardson  [18].  This  would 
provide  programmatic  creation  of  the  Frocess  Definition 
Table,  initial  memory  map,  and  the  other  system 
initialization  data  structures.  This  effort  will  require 
additional  documentation  from  the  Intel  Corporation  on  the 
development  tools  and  file  formats  discussed  in  Chapter  II. 
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Recall  that  this  thesis  made  several  assumptions  to 
simplify  and  expedite  the  development  process.  Near-term 
research  efforts  might  attempt  to  eliminate  some  of  these 
assumptions,  particularly  those  about  the  static  nature  of 
the  run-time  environment.  This  would  result  in  a  more 
generally  applicable  mechanism  that  would  be  less  dependent 
on  a  priori  knowledge  about  the  system  configuration.  In 
order  to  achieve  this  generality,  it  will  be  necessary  to 
automate  most  of  the  functions  that  are  done  manually  in 
this  thesis,  particularly  the  memory  allocation.  The  design 
of  this  initialization  mechanism  is  compatible  with  the 
memory  allocation  scheme  designed  by  O'Connell  and 
Richardson,  and  should  accept  such  a  run-time  memory 
allocation  function  without  major  alterations. 

Of  immediate  concern  to  the  smart  sensor  research 
project  should  be  the  integration  of  the  hard  disc  subsystem 
into  the  hardware  configuration.  The  availability  of  on-line 
secondary  storage  would  permit  further  simplification  of  the 
initialization  mechanism,  and  remove  the  need  for  the 
"controlling  loader". 

The  most  challenging  research  area,  however,  is  dynamic 
reconfiguration  and  its  subsequent  benefit,  fault  tolerance. 
These  are  state-of-the-art  issues  that  are  also  long  term 
goals  of  the  smart  sensor  program.  They  are  also  almost 
mandatory  for  a  viable,  operational  smart  sensor  platform. 


C.  CONCLUSIONS 


The  work  done  in  this  thesis  has  shown  the  feasibility 
of  developing  a  simple,  versatile  system  initialization 
mechanism  based  on  a  core  image  approach  and  the  careful 
sequencing  of  initialization  activities.  The  design  proposed 
in  this  thesis  has  not  been  fully  tested,  but  sufficient 
functions  were  implemented  to  support  the  basic  concepts 
proposed.  The  experience  with  the  system  thus  far  has  shown 
that  the  concepts  are  not  difficult  to  put  into  practice, 
and  that  they  do  result  in  a  simple,  easy  to  understand 
mechanism  for  loading  and  starting  a  process  on  a  bare 
machine.  The  design  proposals  developed  in  this  thesis 
should  prove  beneficial  to  future  initialization  development 
efforts,  even  where  the  hardware  and  operating  system  are 
different. 

The  thesis  has  also  confirmed  the  value  of  an  operating 
system  with  explicit  segments  and  processes,  and  has  shown 
how  such  an  operating  system  structure  can  be  exploited  to 
significantly  simplify  the  initialization  mechanism.  As  this 
structure  for  microcomputer  operating  systems  becomes  more 
widely  implemented,  the  methods  used  in  this  thesis  can  be 
widely  applied  to  simplify  the  entire  system  initialization 
process. 

APPENDIX  A.  UTILITY  PROGRAM  OUTPUT 


A.  OBJECTIVES 

This  appendix  is  provided  to  further  acquaint  the  reader 
with  the  Intel  software  development  utility  programs  used  in 
this  thesis.  Each  program  and  its  pertinent  parameters  and 
options  will  be  explained,  and  a  sample  output  will  be 
provided.  While  these  programs  are  Intel  products,  and  are 
designed  specifically  for  the  Intellec  MDS  with  the  ISIS-II 
operating  system,  they  are  representative  of  programs 
provided  with  other  computer  systems.  The  sample  outputs  at 
the  end  of  this  appendix  are  based  on  a  very  simple  PL/M-86 
program,  written  only  to  demonstrate  the  development  utility 
programs.  The  source  code  for  the  sample  program  is  shown  in 
figure  A-1. 

B.  THE  PL/M-86  COMPILER 

As  mentioned  in  Chapter  II,  the  PL/M-86  Compiler 
translates  the  PL/M-86  source  statements  into  8086  machine 
instructions.  The  MODE  control  in  the  command  line 
determines  the  degree  of  segmentation.  In  the  sample  program 
compilation  in  figure  A-2,  the  CODE  control  was  used  to 
cause  the  compiler  to  list  the  8086  machine  code 
instructions  generated  for  each  PL/M-86  instruction.  Note 
that  the  lengths  of  all  the  segments  produced  by  the 
compiler  are  listed  at  the  end  of  the  output. 

C.  THE  LINK86  PROGRAM 

The  linker  program,  as  discussed  in  Chapter  II,  combines 
the  various  program  modules  that  make  up  a  process  and 
resolves  any  external  references.  At  the  same  time,  it 
adjusts  the  relative  addresses  in  the  module  so  that  they 
are  all  relative  to  the  beginning  of  the  output  module.  The 
sample  LINK86  output  listing  in  figure  A-3  shows  the  list  of 
segments  produced  for  the  sample  program  by  the  Intel 
linker. 

D.  THE  LOC86  PROGRAM 

The  locator  program  is  used  to  assign  physical  memory 
addresses  to  the  relative  addresses  in  the  linker  output 
module.  LOC86  provides  several  diagnostic  and  output  format 
controls  [20].  Diagnostic  information  includes  a  symbol 
table  and  a  complete  memory  map,  showing  the  results  of  the 
locator  function.  This  information  is  sent  to  a  printable 
disc  file  unless  otherwise  specified.  Output  module  controls 
are  used  to  control  the  content  of  the  output  module,  the 
order  of  the  segments  in  the  module,  and  the  assignment  of 
physical  memory  locations  to  the  segments.  The  controls  of 
primary  concern  here  are  the  ADDRESSES  and  SEGMENTS 
controls.  As  seen  in  figure  A-4,  these  controls  assign  a 
base  address  to  each  segment  in  the  process. 

The  other  control  of  interest  during  system 
initialization  is  the  SEGSIZE  control.  It  is  used  to  specify 
the  size  of  one  or  more  segments  in  the  output  module.  This 
control  is  used  during  system  generation  to  build  the  kernel 
stack  frame  discussed  in  Chapter  III. 

The  sample  LOC86  output  in  figure  A-4  includes  the 
process's  symbol  table  and  memory  map.  For  illustrative 
purposes,  the  SEGSIZE  control  was  used  to  add  20H  bytes  to 
the  size  of  the  stack  segment. 
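
The address arithmetic behind these controls can be checked by hand against the memory map in figure A-4. The following is a minimal sketch in Python, not part of the Intel tools; the function names `paragraph_of` and `stop_address` are hypothetical and chosen only for illustration. It assumes the usual 8086 convention that a segment register holds the physical base address divided by 16.

```python
def paragraph_of(physical_address):
    """Return the 16-bit paragraph number (segment register value) for a
    paragraph-aligned 20-bit physical address."""
    if physical_address % 16 != 0:
        raise ValueError("address is not on a 16-byte paragraph boundary")
    return physical_address >> 4


def stop_address(start, length):
    """Last physical address occupied by a segment. A zero-length segment
    is reported with STOP equal to START, as in the MEMORY row of the map."""
    return start if length == 0 else start + length - 1


# Compiler-reported stack size (0006H) plus the 20H bytes added by SEGSIZE.
stack_length = 0x0006 + 0x20

print(f"{stack_length:04X}H")                          # 0026H
print(f"{stop_address(0x03000, stack_length):05X}H")   # 03025H
print(f"{paragraph_of(0x01000):04X}H")                 # 0100H
```

The three printed values correspond to the stack segment LENGTH and STOP columns and to the start address paragraph reported by the locator.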

E.  THE  OH86  PROGRAM 

The  final  utility  program  used  during  system  generation 
is  the  file  conversion  program,  OH86.  Recall  that  this 
program  translates  the  binary  object  file  (for  which  very 
little  documentation  is  available)  into  an  ASCII  hexadecimal 
object  file  (which  is  very  well  documented).  The  sample 
output  from  OH86  is  shown  in  figure  A-5.  The  blank  spaces 
and  line  numbers  were  added  to  improve  readability,  and  do 
not  occur  in  the  actual  output  file. 

Each  hexadecimal  file  produced  by  OH86  is  made  up  of 
four  different  record  types.  These  record  types  are 
explained  below. 

1.  Record  Type  00  is  the  Data  Record.  These  records 
contain  the  actual  program  code  and  data  that  make  up  each 
process. 

2.  Record  Type  01  is  the  End-of-File  Record. 

3.  Record  Type  02  is  the  Extended  Address  Record.  This 
record  specifies  the  segment  base  address  for  the  type  00 
records  that  immediately  follow  it.  For  example,  the  type  02 
record  in  line  11  of  figure  A-5  contains  the  segment  base 
address  (0100H)  for  the  type  00  records  in  lines  12  through 
18. 

4.  Record  Type  03  is  the  Start  Address  Record.  It 
specifies  the  Code  Segment  and  Instruction  Pointer  register 
values  for  the  first  instruction  in  the  Code  Segment.  In  the 
example,  the  CS  register  value  is  0100H,  and  the  IP  register 
value  is  0008H.  The  locations  from  the  address  specified  in 
LOC86  (1000H)  to  the  address  specified  in  the  Start  Address 
Record  (1008H)  are  used  by  the  compiler  to  store  the 
addresses  of  external  data  segments,  and  the  DS  and  SS 
register  values  (see  lines  01  through  04). 
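
The way a loader might act on the four record types can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the loader developed in this thesis: the names `split_record` and `load_records` are hypothetical, the checksum is deliberately ignored here, and the segment base in a type 02 record is assumed to be a paragraph number (physical address divided by 16). The sample records are modeled on figure A-5.

```python
def split_record(line):
    """Split one ':LLAAAATT<data>CC' record into (type, load address, data).
    Blanks added for readability are ignored; the checksum is not examined."""
    text = line.replace(" ", "").strip()
    if not text.startswith(":"):
        raise ValueError("missing record mark")
    raw = bytes.fromhex(text[1:])
    length = raw[0]
    load_address = (raw[1] << 8) | raw[2]
    record_type = raw[3]
    return record_type, load_address, raw[4:4 + length]


def load_records(lines):
    """Return (memory, start), where memory maps physical addresses to byte
    values and start is the (CS, IP) pair from the type 03 record, if any."""
    memory = {}
    segment_base = 0          # paragraph number from the latest type 02 record
    start = None
    for line in lines:
        record_type, load_address, data = split_record(line)
        if record_type == 0x00:                       # Data Record
            for index, value in enumerate(data):
                offset = (load_address + index) & 0xFFFF
                memory[segment_base * 16 + offset] = value
        elif record_type == 0x01:                     # End-of-File Record
            break
        elif record_type == 0x02:                     # Extended Address Record
            segment_base = (data[0] << 8) | data[1]
        elif record_type == 0x03:                     # Start Address Record
            start = ((data[0] << 8) | data[1], (data[2] << 8) | data[3])
        else:
            raise ValueError(f"unknown record type {record_type:02X}")
    return memory, start


sample = [
    ":02 0000 02 0100 FB",
    ":04 0000 00 0000 0030 CC",
    ":04 0000 03 0100 0008 F0",
    ":00 0000 01 FF",
]
memory, start = load_records(sample)
print(sorted(hex(a) for a in memory))   # four bytes at 0x1000 through 0x1003
print(start)                            # (256, 8), i.e. CS = 0100H, IP = 0008H
```

Keeping the current segment base as loader state mirrors the rule that a type 02 record applies to the type 00 records that immediately follow it.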

Each  of  the  records  in  the  hexadecimal  object  file 
consists  of  several  fields.  These  fields,  and  their  effect 
on  the  loading  function,  are  explained  below. 

1.  The  Record  Mark  Field  is  used  as  a  record  delimiter. 
OH86  uses  an  ASCII  colon  (03AH)  to  signify  the  beginning  of 
each  record. 

2.  The  Record  Length  Field  contains  two  ASCII  digits 
that  specify  the  length,  in  bytes,  of  the  data  or 
information  contained  in  the  record. 

3.  The  Load  Address  Field  contains  the  address  offset 
from  the  segment  base  address  (in  the  type  02  record)  for 
the  first  data  byte  in  the  record.  Note  that  only  type  00 
records  have  load  addresses  other  than  0000H.  Recall  from 
Chapter  II  that  there  is  no  boundary  check  made  when 
addressing  into  a  segment.  The  exact  load  address  for  a 
particular  data  byte  can  be  calculated  as  follows: 

EFF.  ADDR.  =  BASE  ADDRESS  +  [(DRLA  +  DRI)  MODULO  64K] 

Where  DRLA  is  the  Data  Record  Load  Address,  and  DRI  is  the 
byte  index  within  the  Data  Record. 

4 .  The  Record  Type  Field  specifies  the  type  of  the 
record,  as  described  above. 

5 .  The  Data  Field  contains  the  actual  data  to  be 
converted  to  binary  and  loaded  into  primary  memory.  This  is 
a  variable  length  field  that  may  be  from  0  to  10H  bytes 
long. 

6.  The  Checksum  Field  is  used  for  error  detection  in  the 
loading  and  translating  process.  It  contains  the  twos 
complement  of  the  8-bit  sum  of  the  bytes  that  result  from 
converting  the  ASCII  bytes  back  into  binary. 
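
The field layout, the checksum rule, and the load address calculation can be exercised together in a short Python sketch. This is an illustration only, not code from the thesis: the names `parse_fields`, `checksum`, and `effective_address` are hypothetical, and the base address is assumed to be the paragraph number from the type 02 record multiplied by 16. The sample data record is modeled on figure A-5.

```python
def checksum(record_bytes):
    """Two's complement of the 8-bit sum of all bytes that precede the
    checksum field (length, load address, type, and data)."""
    return (-sum(record_bytes)) & 0xFF


def parse_fields(line):
    """Split a record into its six fields and verify the checksum."""
    text = line.replace(" ", "").strip()
    if text[:1] != ":":                       # Record Mark Field (3AH)
        raise ValueError("record mark not found")
    raw = bytes.fromhex(text[1:])
    length = raw[0]                           # Record Length Field
    if len(raw) != length + 5:
        raise ValueError("record length field disagrees with record size")
    if checksum(raw[:-1]) != raw[-1]:         # Checksum Field
        raise ValueError("checksum mismatch")
    return {
        "length": length,
        "load_address": (raw[1] << 8) | raw[2],   # Load Address Field
        "type": raw[3],                           # Record Type Field
        "data": raw[4:4 + length],                # Data Field
        "checksum": raw[-1],
    }


def effective_address(segment_base, drla, dri):
    """Physical address of data byte number dri in a data record, with the
    offset wrapping modulo 64K inside the segment."""
    return segment_base * 16 + ((drla + dri) & 0xFFFF)


fields = parse_fields(":10 0008 00 FA2E8E160400BC06008BEC2E8E1E0600 FF")
first = effective_address(0x0100, fields["load_address"], 0)
print(f"{fields['length']:02X}H bytes, first byte loads at {first:05X}H")
```

Because the offset is reduced modulo 64K before the base is added, a record whose load address runs past FFFFH wraps to the start of the same segment instead of spilling into the next one, which is why no boundary check is implied.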

F.  SUMMARY 


This  appendix  was  intended  to  acquaint  the  user  with 
more  details  concerning  the  software  development  utility 
programs  used  to  develop  the  system  generation  mechanism 
described  in  this  thesis.  It  has  provided  a  very  simple 
PL/M-86  program  and  the  output  from  each  of  the  development 
utilities.  The  reader  desiring  additional  information  about 
these  programs  should  refer  to  the  MCS-86  Software 
Development  Utilities  Operating  Instructions  for  ISIS-II 
Users  [20]. 


SOURCE  LISTING 


/****************************************************/
/*                                                  */
/*  Sample Program to demonstrate the software      */
/*  development utility programs used during        */
/*  system generation.  This program simply         */
/*  increments a global array element nine          */
/*  times and then prints the result on the         */
/*  terminal screen.                                */
/*                                                  */
/****************************************************/

COUNTER1: DO;

    DECLARE I BYTE,                      /* loop index */
            ARRAY(2) BYTE EXTERNAL,      /* external array */
            PROMPT(*) BYTE INITIAL('VALUE IS:  '),
            STATUSPORT LITERALLY '0D8H',
            DATAPORT LITERALLY '0DAH',
            XMITRDY LITERALLY '001B';

/****************************************************/
/*                                                  */
/*  OUTCHAR is a procedure which tests the          */
/*  status of the serial I/O port that is           */
/*  connected to the terminal.  If the port         */
/*  is "ready", an ASCII character is output        */
/*  to the CRT screen.                              */
/*                                                  */
/****************************************************/

    OUTCHAR: PROCEDURE (CHAR);
        DECLARE CHAR BYTE;
        DO WHILE (INPUT(STATUSPORT) AND XMITRDY) = 0;
        END;    /* wait until ready to transmit */
        OUTPUT(DATAPORT) = CHAR AND 07FH;
    END;    /* of OUTCHAR declaration */

    ARRAY(0) = 0;    /* initialize sum */

    DO I = 0 TO 9;
        /* increment the sum */
        ARRAY(0) = ARRAY(0) + 1;
    END;    /* of DO loop */

    DO I = 0 TO LAST(PROMPT);
        /* print the "header" */
        CALL OUTCHAR(PROMPT(I));
    END;    /* of print loop */

    CALL OUTCHAR(ARRAY(0));    /* print the sum */
END;    /* COUNTER1 program */


PL/M-86  Source  Listing 
Figure  A-1 

PL/M-86 COMPILER    COUNTER1


ISIS-II PL/M-86 V1.2 COMPILATION OF MODULE COUNTER1
OBJECT MODULE PLACED IN :F1:CNTR1.OBJ
COMPILER INVOKED BY:  PLM86 :F1:CNTR1.SRC CODE LARGE
DATE(1 JUNE 80)


             /****************************************************/
             /*                                                  */
             /*  Sample Program to demonstrate the software      */
             /*  development utility programs used during        */
             /*  system generation.  This program simply         */
             /*  increments a global array element nine          */
             /*  times and then prints the result on the         */
             /*  terminal screen.                                */
             /*                                                  */
             /****************************************************/

  1          COUNTER1: DO;
  2   1          DECLARE I BYTE,                 /* loop index */
                         ARRAY(2) BYTE EXTERNAL,
                         PROMPT(*) BYTE INITIAL('VALUE IS:  '),
                         STATUSPORT LITERALLY '0D8H',
                         DATAPORT LITERALLY '0DAH',
                         XMITRDY LITERALLY '001B';

             /****************************************************/
             /*  OUTCHAR is a procedure which tests the          */
             /*  status of the serial I/O port that is           */
             /*  connected to the terminal.  If the port         */
             /*  is "ready", an ASCII character is output        */
             /*  to the CRT screen.                              */
             /****************************************************/

  3   1      OUTCHAR: PROCEDURE (CHAR);
                                    ; STATEMENT # 3
                         OUTCHAR   PROC NEAR
             0073  55               PUSH    BP
             0074  8BEC             MOV     BP,SP
  4   2          DECLARE CHAR BYTE;
  5   2          DO WHILE (INPUT(STATUSPORT) AND XMITRDY) = 0;
                                    ; STATEMENT # 5
                         @1:
             0076  E4D8             IN      0D8H
             0078  F6C001           TEST    AL,1H
             007B  7403             JZ      $+5H
             007D  E90300           JMP     @2
  6   3          END;
                                    ; STATEMENT # 6
             0080  E9F3FF           JMP     @1
                         @2:
  7   2          OUTPUT(DATAPORT) = CHAR AND 07FH;
                                    ; STATEMENT # 7
             0083  8A4604           MOV     AL,[BP].CHAR
             0086  80E07F           AND     AL,7FH
             0089  E6DA             OUT     0DAH
  8   2      END;
                                    ; STATEMENT # 8
             008B  5D               POP     BP
             008C  C20200           RET     2H
                         OUTCHAR   ENDP
  9   1      ARRAY(0) = 0;                       /* initialize sum */
                                    ; STATEMENT # 9
             0008  FA               CLI
             0009  2E8E160400       MOV     SS,CS:@@STACK$FRAME
             000E  BC0600           MOV     SP,@@STACK$OFFSET
             0011  8BEC             MOV     BP,SP
             0013  2E8E1E0600       MOV     DS,CS:@@DATA$FRAME
             0018  FB               STI
             0019  2EC41E0000       LES     BX,CS:@ARRAY
             001E  26C60700         MOV     ES:ARRAY[BX],0H
 10   1      DO I = 0 TO 9;
                                    ; STATEMENT # 10
             0022  C606000000       MOV     I,0H
                         @3:
             0027  803E000009       CMP     I,9H
             002C  7603             JBE     $+5H
             002E  E91100           JMP     @4
 11   2          ARRAY(0) = ARRAY(0) + 1;        /* increment sum */
                                    ; STATEMENT # 11
             0031  2EC41E0000       LES     BX,CS:@ARRAY
             0036  26FE07           INC     ES:ARRAY[BX]
 12   2      END;                                /* DO LOOP */
                                    ; STATEMENT # 12
             0039  FE060000         INC     I
             003D  7403             JZ      $+5H
             003F  E9E5FF           JMP     @3
                         @4:
 13   1      DO I = 0 TO LAST(PROMPT);           /* print the header */
                                    ; STATEMENT # 13
             0042  C606000000       MOV     I,0H
                         @5:
             0047  803E00000A       CMP     I,0AH
             004C  7603             JBE     $+5H
             004E  E91500           JMP     @6
 14   2          CALL OUTCHAR(PROMPT(I));
                                    ; STATEMENT # 14
             0051  8A1E0000         MOV     BL,I
             0055  B700             MOV     BH,0H
             0057  FF7701           PUSH    PROMPT[BX]      ; 1
             005A  E81600           CALL    OUTCHAR
 15   2      END;
                                    ; STATEMENT # 15
             005D  FE060000         INC     I
             0061  7403             JZ      $+5H
             0063  E9E1FF           JMP     @5
                         @6:
 16   1      CALL OUTCHAR(ARRAY(0));             /* print the sum */
                                    ; STATEMENT # 16
             0066  2EC41E0000       LES     BX,CS:@ARRAY
             006B  26FF37           PUSH    ES:ARRAY[BX]
             006E  E80200           CALL    OUTCHAR
 17   1      END;                                /* COUNTER1 */
                                    ; STATEMENT # 17
             0071  FB               STI
             0072  F4               HLT


MODULE INFORMATION:

     CODE AREA SIZE     = 008FH    143D
     CONSTANT AREA SIZE = 0000H      0D
     VARIABLE AREA SIZE = 000CH     12D
     MAXIMUM STACK SIZE = 0006H      6D
     52 LINES READ
     0 PROGRAM ERROR(S)

END OF PL/M-86 COMPILATION


PL/M-86  Compiler  Listing 
Figure  A-2 



LINK86  LISTING 


ISIS-II MCS-86 LINKER, V1.1, INVOKED BY:
LINK86 :F1:CNTR1.OBJ, :F1:ARRAY.OBJ TO :F1:CNTR1.LNK
LINK MAP FOR :F1:CNTR1.LNK(COUNTER1)

LOGICAL SEGMENTS INCLUDED:

LENGTH   ADDRESS   SEGMENT          CLASS
008FH    -----     COUNTER1_CODE    CODE
000CH    -----     COUNTER1_DATA    DATA
0006H    -----     STACK            STACK
0000H    -----     MEMORY           MEMORY
0000H    -----     ARRAYDEC_CODE    CODE
0002H    -----     ARRAYDEC_DATA    DATA

INPUT MODULES INCLUDED:
:F1:CNTR1.OBJ(COUNTER1)
:F1:ARRAY.OBJ(ARRAYDEC)


LINK86  Listing 
Figure  A-3 

NAVAL  POSTGRADUATE  SCHOOL  MONTEREY  CA  F/G  9/2 
DESIGN  OF  A  SYSTEM  INITIALIZATION  MECHANISM  FOR  A  MULTIPLE  MICR--ETC(U) 
JUN  80  J  L  ROSS 


ISIS-II MCS-86 LOCATER, V1.1 INVOKED BY:
LOC86 :F1:CNTR1.LNK TO :F1:CNTR1.RUN ADDRESSES(SEGMENTS&
(COUNTER1_CODE(1000H),COUNTER1_DATA(2000H),STACK(3000H),&
ARRAYDEC_DATA(30000H),ARRAYDEC_CODE(31000H),&
MEMORY(31100H)))&
SEGSIZE(STACK(+20H)) RS(0 TO 0FFFH)

SYMBOL TABLE OF MODULE COUNTER1
READ FROM FILE :F1:CNTR1.LNK
WRITTEN TO FILE :F1:CNTR1.RUN

BASE    OFFSET  TYPE  SYMBOL         BASE    OFFSET  TYPE  SYMBOL

3000H   0000H   PUB   ARRAY

ARRAYDEC: SYMBOLS AND LINES

3110H   0000H   SYM   MEMORY         3000H   0000H   SYM   ARRAY
3100H   0000H   LIN   3

MEMORY MAP OF MODULE COUNTER1
READ FROM FILE :F1:CNTR1.LNK
WRITTEN TO FILE :F1:CNTR1.RUN

MODULE START ADDRESS PARAGRAPH = 0100H OFFSET = 0008H
SEGMENT MAP

START     STOP      LENGTH   ALIGN   NAME             CLASS
01000H    0108EH    008FH    W       COUNTER1_CODE    CODE
02000H    0200BH    000CH    W       COUNTER1_DATA    DATA
03000H    03025H    0026H    W       STACK            STACK
30000H    30001H    0002H    W       ARRAYDEC_DATA    DATA
31000H    31000H    0000H    W       ARRAYDEC_CODE    CODE
31100H    31100H    0000H    W       MEMORY           MEMORY


LOC86  Listing 
Figure  A-4 

01  :  02  0000  02  0100  FB
02  :  04  0000  00  0000 0030  CC
03  :  02  0000  02  0100  FB
04  :  04  0004  00  0003 0002  F3
05  :  02  0000  02  0200  FA
06  :  10  0002  00  474C4F42414C2056414C55452049533A  AA
07  :  02  0012  00  2020  AC
08  :  02  0000  02  0107  F4
09  :  10  0003  00  558BECE4D8F6C0017403E90300E9F3FF  70
10  :  0C  0013  00  8A460480E07FE6DA5DC20200  4D
11  :  02  0000  02  0100  FB
12  :  10  0008  00  FA2E8E160400BC06008BEC2E8E1E0600  FF

17  :  10  0058  00  7702E81500FE0600007403E9E1FF2EC4  EC
18  :  0B  0068  00  1E000026FF37E80200FBF4  3A
19  :  04  0000  03  01000008  F0
20  :  00  0000  01  FF


OH86  Listing 
Figure  A-5 